SEQUENTIAL RANKING PROCEDURES
By 

Elias Alphonse Parent, Jr. 



1. Introduction. Many statistical procedures used and studied
today are sequential in nature. By this we mean that the time when a 
statistical decision is reached is random. In contrast to such proce- 
dures are the fixed sample size procedures. Best known perhaps is 
sequential analysis and the sequential probability ratio test as formu- 
lated by Wald [6]. There are other sequential procedures, for example 
in process inspection schemes, where, based on a sequence of observations 
a decision is made to stop the process and take some adjusting action, 
the time at which the process is stopped being a random variable. There 
are many other sequential-like procedures. 

In the theory of hypothesis testing for the case of a simple hypothesis
against a simple alternative it is known that a most powerful test
can be determined by the Neyman-Pearson lemma, which is of the form:

    reject $f = f_0$ if $\lambda_n = \dfrac{f_1(X_1, X_2, \ldots, X_n)}{f_0(X_1, X_2, \ldots, X_n)} > K$

where the hypotheses to be tested are $f = f_0$ against $f = f_1$, and $f_0$
and $f_1$ are the joint densities of the observations $X_1, X_2, \ldots, X_n$
corresponding to each hypothesis. This is an example of a nonsequential
procedure. To extend such a procedure to the sequential idea we need
only modify the test as follows:

    take a sample of size $m$ and
        reject $f_0$ if $\lambda_m \ge K_1$
        accept $f_0$ if $\lambda_m \le K_0$
        draw another sample of size $n - m$ if $K_0 < \lambda_m < K_1$

    if the second sample is required compute $\lambda_n$ and
        reject $f_0$ if $\lambda_n > K$
        accept $f_0$ if $\lambda_n \le K$.



Such a simple modification gives us a two stage procedure with a new 
feature in that the total sample size is random, being either m or n, 
depending upon the outcome of the first stage. This basic idea of a 
sequential test was proposed by Dodge and Romig in [8], and has been 
extended to multiple stage sampling plans. 

Sequential hypothesis testing as proposed by Wald requires that
a computation of $\lambda_n$ and a decision be made as each observation is
taken. Briefly, to test $f = f_0$ against $f = f_1$ select constants
$B < A$ and compute $\lambda_n$ as each observation is taken, and proceed
according to the rule

    if $\lambda_n \ge A$         reject $f = f_0$
    if $\lambda_n \le B$         reject $f = f_1$
    if $B < \lambda_n < A$     take another observation and compute $\lambda_{n+1}$.

Since the sequential probability ratio test is formulated in
terms of the ratio which leads to most powerful tests according to
the Neyman-Pearson theory we would expect it to have good properties.
This indeed is the case in that of all tests with the same power the
sequential probability ratio test requires on the average the fewest
observations. This optimal property was conjectured by Wald and finally
proved by Wald and Wolfowitz in [9].
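
The rule above can be sketched in code. The following is only a minimal illustration, assuming independent observations so that the ratio $\lambda_n$ is a product of per-observation ratios (accumulated here on the log scale); the function names `sprt` and `normal_logpdf` and the normal densities used in the usage note are hypothetical choices made for the example, not part of the procedure as stated.

```python
import math
from typing import Callable, Iterable, Tuple


def sprt(observations: Iterable[float],
         log_f0: Callable[[float], float],
         log_f1: Callable[[float], float],
         A: float, B: float) -> Tuple[str, int]:
    """Apply the rule: reject f0 if lambda_n >= A, reject f1 if lambda_n <= B,
    otherwise take another observation. Assumes independent observations."""
    if not 0 < B < A:
        raise ValueError("need 0 < B < A")
    log_lambda = 0.0  # log of the probability ratio lambda_n
    n = 0
    for x in observations:
        n += 1
        log_lambda += log_f1(x) - log_f0(x)
        if log_lambda >= math.log(A):
            return "reject f0", n
        if log_lambda <= math.log(B):
            return "reject f1", n
    return "no decision", n  # data ran out before a boundary was reached


def normal_logpdf(mean: float) -> Callable[[float], float]:
    """Log density of a unit-variance normal (illustrative choice only)."""
    return lambda x: -0.5 * (x - mean) ** 2 - 0.5 * math.log(2 * math.pi)
```

For instance, `sprt([1.0] * 20, normal_logpdf(0.0), normal_logpdf(1.0), 19.0, 1 / 19.0)` stops as soon as the accumulated log ratio crosses $\log 19$; the number of observations used is random in general, which is the defining feature of the sequential test.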






In order to carry out these sequential tests of hypotheses we note
that an assumption as to the specific form of $f_0$ and $f_1$ must be
made. It often happens that the form of the underlying distribution
is not assumed known and in this case nonparametric statistical methods
are used. In nonparametric statistics many tests of statistical
hypotheses are based on the set of ranks $(T_1, T_2, \ldots, T_n)$ determined
from a random sample $(X_1, X_2, \ldots, X_n)$, or the signs of the
observations ($\pm 1$ according as $X_i$ is positive or negative) or on a
combination of both of these sets of statistics derived from the basic
observations. The sign test, signed rank test, Wilcoxon-Mann-Whitney
test, Fisher-Yates test and many others are examples of such fixed
sample size nonparametric tests.

Contrary to the case in parametric statistics (as opposed to non- 
parametric statistics) there are very few sequential procedures in 
nonparametric statistics, particularly sequential procedures based on 
signs, ranks, or both. One reason for this is that for most specified 
alternatives to the null hypothesis it is difficult to compute proba- 
bilities for statistics based on signs and ranks which in turn makes 
it difficult to properly evaluate the properties and operating charac- 
teristics of the procedures. This difficulty can be circumvented by 
restricting attention to special classes of alternatives such as those 
proposed by Lehmann in [1], where to the null hypothesis $F(x)$ he
proposed alternatives of the form $F^a(x)$, $a > 0$. This of course does
not solve the basic problem of alternatives as the question of whether
or not the Lehmann alternative is appropriate for the problem being
considered arises. However it is a first step inasmuch as it does
allow us to develop some sequential procedures where exact distribution
theory calculations are possible. In the fixed sample size problem it 
simplifies considerations of power of rank tests. 

An example of a nonparametric sequential test is the following
adaptation of Wald's sequential probability ratio test for binomial
observations. Consider a sequence of independent identically
distributed random variables $X_1, X_2, \ldots$ with cumulative distribution function
$F(t) = P(X_1 \le t)$. We wish to test $F(t_0) = p_0$ against $F(t_0) = p_1$
for some fixed value $t_0$. The number of observations less than or
equal to $t_0$, say $N$, after taking $n$ observations, is a binomial
random variable with parameters $F(t_0)$ and $n$. The probability ratio
reduces to

(1.1)   $\lambda_n = \dfrac{P(N \mid F(t_0) = p_1)}{P(N \mid F(t_0) = p_0)} = \left(\dfrac{p_1}{p_0}\right)^{N} \left(\dfrac{1 - p_1}{1 - p_0}\right)^{n - N}$

and the sequential test based on this ratio is discussed in Wald [6].
For the special case where $t_0 = 0$, $N$ is equivalent to the number of
negative observations after $n$ trials and this would be a sequential
test based on the signs of the observations.
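
As a hedged illustration of the binomial ratio above, the sketch below counts the observations at or below $t_0$ and evaluates the logarithm of $(p_1/p_0)^N ((1-p_1)/(1-p_0))^{n-N}$ after each observation. The names `binomial_log_ratio` and `sign_sprt`, and the boundary constants `A` and `B` passed in by the caller, are assumptions made for the example.

```python
import math
from typing import Iterable, Tuple


def binomial_log_ratio(N: int, n: int, p0: float, p1: float) -> float:
    """Log of (p1/p0)^N * ((1-p1)/(1-p0))^(n-N)."""
    return N * math.log(p1 / p0) + (n - N) * math.log((1 - p1) / (1 - p0))


def sign_sprt(observations: Iterable[float], t0: float,
              p0: float, p1: float, A: float, B: float) -> Tuple[str, int]:
    """Sequential test of F(t0) = p0 against F(t0) = p1 using only the
    count N of observations <= t0 (with t0 = 0 this is a test on signs)."""
    N = 0  # number of observations <= t0 so far
    n = 0  # number of observations taken so far
    for x in observations:
        n += 1
        if x <= t0:
            N += 1
        log_lam = binomial_log_ratio(N, n, p0, p1)
        if log_lam >= math.log(A):
            return "reject p0", n
        if log_lam <= math.log(B):
            return "reject p1", n
    return "no decision", n
```

Only the indicator of each observation falling at or below $t_0$ enters the statistic, so no assumption on the form of $F$ is needed beyond the value $F(t_0)$.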

An example of a nonparametric sequential procedure based on ranks 
of observations is the grouped rank test developed by Wilcoxon, Rhodes 
and Bradley [4]. Actually two sequential procedures are developed in 
[4], the Configural Rank Test and the Rank Sum Test. Basically, obser- 
vations are taken in groups of m X’s and n Y’s and the observations 
are ranked within each group. For each group a statistic is computed
based on the ranks and Wald's sequential probability ratio test is
applied to the sequence of statistics so generated. Each group of m 
X's and n Y's becomes the basic unit used in the probability ratio. 
Suppose the X- population has distribution F(x) and the Y- population 
has distribution G(y), and observations are taken as follows 



    $(X_{11}, X_{12}, \ldots, X_{1m}, Y_{11}, Y_{12}, \ldots, Y_{1n})$  -  group 1
    $(X_{21}, X_{22}, \ldots, X_{2m}, Y_{21}, Y_{22}, \ldots, Y_{2n})$  -  group 2
    ...
    $(X_{\gamma 1}, X_{\gamma 2}, \ldots, X_{\gamma m}, Y_{\gamma 1}, Y_{\gamma 2}, \ldots, Y_{\gamma n})$  -  group $\gamma$

Let $R_\gamma = (R_{\gamma 1}, R_{\gamma 2}, \ldots, R_{\gamma m}, S_{\gamma 1}, S_{\gamma 2}, \ldots, S_{\gamma n})$ be the rank
vector associated with group $\gamma$ where $R_{\gamma i}$ is the rank of $X_{\gamma i}$ and
$S_{\gamma i}$ is the rank of $Y_{\gamma i}$, the ranks taken from the combined ranking of
the X's and Y's. Taking a function of $R_\gamma$, say $T_\gamma = T(R_\gamma)$, we
generate a new sequence of random variables $T_1, T_2, \ldots$ and the Wald
sequential probability ratio test may now be applied to the $T_\gamma$. For
independent group to group sampling we have

(1.2)   $\lambda_n = \prod_{\gamma=1}^{n} \dfrac{P(T_\gamma \mid Y \sim G(y))}{P(T_\gamma \mid Y \sim F(y))}$

as the probability ratio to test the hypothesis that the Y-population
has distribution $F(y)$ against $G(y)$. In [4] the authors consider
Lehmann alternatives $G(y) = F^k(y)$, $k > 0$ and the function $T$ in
one case is the actual configuration of X's and Y's, which is
equivalent to the vector $(S_{\gamma 1}, S_{\gamma 2}, \ldots, S_{\gamma n})$, and in the second
case $T$ is taken to be the sum of the Y ranks.

Wilcoxon, Rhodes and Bradley observe that the test could be 
improved by taking observations in pairs and reranking from the begin- 
ning each time a new observation pair is taken. One reason for the 
reduced efficiency of the group ranking method is that the observations 
in one group are not compared with observations from any other group. 
The reranking suggestion would take into account all comparisons. How- 
ever, this is very cumbersome, and moreover reranking introduces non- 
independence of successive probability ratios making an analysis of the 
properties of such a procedure difficult. 

Thus in order to attack the problem of nonparametric sequential 
tests of hypotheses based on ranks we should consider procedures such 
that the distribution theory is tractable and such that ranks are 
assigned in a truly sequential manner, avoiding as much as possible 
the complexities introduced by reranking. To this end two new sequen- 
tial ranking methods will be defined in this dissertation. 

In order to be led somewhat naturally to these new ranking methods
we now consider the reranking procedure in more detail. Let $T_{ji}$ be
the rank of $X_j$ at the $i^{\text{th}}$ stage in the reranking process. We
observe $X_1, X_2, \ldots, X_n, \ldots$ and each time a new observation is
taken the entire set of observations is reranked. We have

    Observation vectors                Rank vectors

    $(X_1)$                              $(T_{11})$
    $(X_1, X_2)$                         $(T_{12}, T_{22})$
    $(X_1, X_2, X_3)$                    $(T_{13}, T_{23}, T_{33})$
    ...                                ...
    $(X_1, X_2, \ldots, X_n)$            $(T_{1n}, T_{2n}, \ldots, T_{nn})$

Notice that the vector $(T_{11}, T_{22}, \ldots, T_{nn})$ completely
determines the $n$ rank vectors listed above in the sense that each vector
could be reconstructed given only $T_{ii}$, $i = 1, 2, \ldots, n$. $T_{ii}$ is
the rank of $X_i$ relative to the set $\{X_1, X_2, \ldots, X_i\}$. Thus we
can rank an observation as it is observed, relative to the preceding
observations without reranking the previous observations and still
retain the information contained in the $n$ rank vectors which would
come from reranking. This method of ranking observations is one way
of assigning ranks which fits in naturally with the idea of sequential
procedures and lends itself to developing sequential procedures in
nonparametric problems. This ranking procedure also takes into account
all comparisons among the observations.

Analogous to the fixed sample size signed rank test we will define
a second sequential ranking procedure based upon the absolute values of
the observations and taking into account the signs of the observations.
This signed sequential ranking procedure will be applied to a problem
in process control. By process control we mean a procedure where the
aim is to determine when a given sequence of random variables changes
from being distributed according to a distribution $F(x)$ to a different
distribution $G(x)$. The term process control enjoys a broader definition
today including those cases where the process is adjusted according to
some statistic based upon the sequence of observations. Such
procedures are referred to as adaptive control methods.

The early methods used to control a process were based on control
charts (Shewhart charts) and modifications of these control charts.
To control the mean value of some dimension of a process at a particular
value $\mu_0$, samples of size $n$ are taken at frequent intervals of time
and the sample mean $\bar{X}$ is compared with $\mu_0 \pm k\sigma/\sqrt{n}$. If $\bar{X}$ falls
outside these lines the process is stopped and adjustments to the
process are carried out, and for $\mu_0 - k\sigma/\sqrt{n} \le \bar{X} \le \mu_0 + k\sigma/\sqrt{n}$ the
process is allowed to continue without adjustment. Modifications to
the basic control chart method came in the form of "warning lines"
inside the action lines $\mu_0 \pm k\sigma/\sqrt{n}$. Further modifications were
introduced which changed the action rule to rules of the type "if K
consecutive points on the chart fall outside control lines, take action."
These early procedures failed to take advantage of all the information
contained in the sequence $\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_n$. At best the modified
action rules used only the information contained in a fixed number of
sample values in the immediate past.

In order to take advantage of this unused information the stopping
rule should incorporate the entire sample. A step in this direction
was taken by Page in [7] with the introduction of cumulative sum
schemes. If the mean of a process is to be controlled the cumulative
sums $S_n = \sum_{i=1}^{n} (X_i - k)$ are plotted on a chart against $n$. The entire
history of the process is presented and changes in the process mean are
visible through changes in direction of the mean path. To detect
one-sided deviations in the mean, say increases, the stopping rule used is
to stop the process when the current point of the path $(n, S_n)$ rises
a given amount $h > 0$ above the previous lowest point of the path.
Two-sided deviations are treated by applying two one-sided schemes
simultaneously. For normal observations the cumulative sum schemes
have been found to be more sensitive than the Shewhart control chart.
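
The one-sided stopping rule just described can be sketched as follows. This is an illustrative reading only: it assumes the path starts at $S_0 = 0$, that $S_0$ counts as a candidate for the previous lowest point, and that a rise of exactly $h$ triggers a stop; the function name `page_cusum` is hypothetical.

```python
from typing import Iterable, Optional


def page_cusum(xs: Iterable[float], k: float, h: float) -> Optional[int]:
    """One-sided cumulative sum scheme for detecting an increase in the mean.

    Tracks S_n = sum_{i<=n} (x_i - k) and stops at the first n for which
    S_n is at least h above the lowest earlier point of the path.
    Returns the stopping time n, or None if the rule never triggers.
    """
    if h <= 0:
        raise ValueError("h must be positive")
    s = 0.0        # current cumulative sum S_n
    lowest = 0.0   # lowest point of the path so far (assumes S_0 = 0)
    for n, x in enumerate(xs, start=1):
        s += x - k
        if s - lowest >= h:
            return n
        lowest = min(lowest, s)
    return None
```

With `k = 0.5` and `h = 1.5`, the sequence `[0, 0, 0, 1, 1, 1, 1]` drifts down for three observations and then climbs; the rule compares the path with its lowest earlier point rather than with a fixed line, which is how the whole history of the process enters the decision.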

When no assumption is made as to the form of the underlying
distributions we might look to nonparametric methods for a control
procedure. For example, the sequential rank of $X_i$ is equally likely
to be $1, 2, \ldots, i$ as long as no change takes place in the
distribution of $X_1, X_2, \ldots, X_i$. But when a location change takes place,
say an increase in the process mean, larger ranks would be more probable.
We will consider the sequential rank of $|X_i|$ relative to $|X_1|, |X_2|, \ldots, |X_i|$,
multiplied by the sign of $X_i$ ($+1$ if $X_i > 0$ and $-1$ if $X_i < 0$),
in a process control problem. This method of sequentially assigning
ranks, as noted before, will be called signed sequential ranking.
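
A minimal sketch of the signed sequential rank described above, assuming nonzero observations with distinct absolute values (which holds with probability 1 for continuous distributions). The function name `signed_sequential_ranks` is an assumption for the example.

```python
import bisect
from typing import List, Sequence


def signed_sequential_ranks(xs: Sequence[float]) -> List[int]:
    """Rank |x_i| among |x_1|, ..., |x_i| and attach the sign of x_i.

    Assumes no zeros and no ties among the absolute values.
    """
    seen: List[float] = []  # sorted absolute values observed so far
    out: List[int] = []
    for x in xs:
        a = abs(x)
        rank = bisect.bisect_left(seen, a) + 1  # 1-based rank of |x| so far
        bisect.insort(seen, a)
        out.append(rank if x > 0 else -rank)
    return out
```

Each new observation is compared only with the absolute values already seen, so earlier signed ranks never need to be recomputed.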

This dissertation defines two methods of assigning ranks in a
sequential manner to observations $X_1, X_2, \ldots$. Basic properties of
the sequential ranks are studied and distribution theory is determined.
Section 2 contains some preliminary results including some relating to
order statistics of observations taken from nonidentical distributions.
These results are used in the later sections. In Section 3 the method
of sequential ranking is defined and it is shown that for a fixed sample
size, ordinary ranks and sequential ranks are equivalent for the purpose
of hypothesis testing. Section 4 is an application to sequential
hypothesis testing for the two sample problem where the alternative is of
the form proposed by Lehmann in [1]. The signed sequential ranking
scheme is defined in Section 5 and a condition on the distribution of
the sequence of observations is given which implies that the signed
sequential ranks are independent. Distribution theory is given for the
signed sequential ranks. Section 6 contains an application of signed
sequential ranking to a process control problem.

2. Preliminary results. Let $X_1, X_2, \ldots, X_n$ be any random
variables with continuous cumulative distribution functions $F_1, F_2, \ldots, F_n$.
Define $X_{nk}$ to be the $k^{\text{th}}$ smallest in the set
$\{X_1, X_2, \ldots, X_n\}$. We can obtain a general expression for the
distribution of $X_{nk}$ as follows:

(2.1)   $F_{nk}(x) = P(X_{nk} \le x) = \sum_{i=k}^{n} P(i \text{ X's are} \le x \text{ and } n-i \text{ X's are} > x)$

Letting $E_i$ denote the event [$i$ X's are $\le x$ and $n-i$ X's are $> x$]
there are $\binom{n}{i}$ ways to select the X's which are less than or equal to
$x$, and a typical way in which $E_i$ could occur is

    $E_{ij} = [X_{j_1} \le x, \ldots, X_{j_i} \le x,\ X_{j_{i+1}} > x, \ldots, X_{j_n} > x]$

where $j = 1, 2, \ldots, \binom{n}{i}$ to take into account all possible cases.
For $j \ne j'$ the events $E_{ij}$ and $E_{ij'}$ are disjoint and $E_i = \bigcup_j E_{ij}$.
Thus we have

    $P(E_i) = \sum_{j} P(E_{ij})$

and further, when the $X_i$ are assumed to be independent we obtain

    $P(E_{ij}) = \prod_{m=1}^{i} P(X_{j_m} \le x) \prod_{m=i+1}^{n} \bigl(1 - P(X_{j_m} \le x)\bigr)$.

As a special case of (2.1), to be used later, we have the following
result when the X's are distributed according to only two different
distributions.

Lemma 2.1. Let $X_1, X_2, \ldots, X_N$ be independent random variables
where $\{X_i, 1 \le i \le m\}$ are distributed according to $F(x)$ and
$\{X_i, m + 1 \le i \le N\}$ are distributed according to $G(x)$. Then

(2.2)   $F_{Nk}(x) = \sum_{i=k}^{N} \sum_{j=0}^{i} \binom{m}{j} \binom{N-m}{i-j} F^{j}(x)\,(1 - F(x))^{m-j}\, G^{i-j}(x)\,(1 - G(x))^{N-m-i+j}$

Proof: Each of the basic events $E_i$ (defined above) can be
written as a union of disjoint events $E_{ij}$, $j = 0, 1, 2, \ldots, i$ where
$E_{ij}$ consists of $j$ X's (with distribution $F(x)$) $\le x$ and $i - j$ X's
(with distribution $G(x)$) $\le x$, the remaining X's are $> x$. There are
$\binom{m}{j} \binom{N-m}{i-j}$ ways to select such an event, each having probability
$F^{j}(x)\,(1 - F(x))^{m-j}\, G^{i-j}(x)\,(1 - G(x))^{N-m-i+j}$. We use the convention that
$\binom{a}{b} = 0$ if $a < b$.

Remark: When $G = F$ we can use the fact that

    $\sum_{j=0}^{i} \binom{m}{j} \binom{N-m}{i-j} = \binom{N}{i}$

to get the known result

(2.3)   $F_{Nk}(x) = \sum_{i=k}^{N} \binom{N}{i} F^{i}(x)\,(1 - F(x))^{N-i}$.
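
Formula (2.2) can be checked numerically at a fixed point $x$. The sketch below is an illustration under stated assumptions: the arguments `Fx` and `Gx` stand for the numbers $F(x)$ and $G(x)$, and the function names are hypothetical. It compares the double sum with a brute-force enumeration over which observations fall at or below $x$, and confirms the reduction to (2.3) when $G = F$.

```python
from itertools import product
from math import comb


def order_stat_cdf_two_groups(k: int, m: int, N: int, Fx: float, Gx: float) -> float:
    """Double sum (2.2): P(k-th smallest of N observations <= x), where the
    first m observations have CDF value Fx at x and the other N-m have Gx."""
    total = 0.0
    for i in range(k, N + 1):
        for j in range(0, i + 1):
            if j > m or i - j > N - m:
                continue  # convention: binomial coefficient is 0 here
            total += (comb(m, j) * comb(N - m, i - j)
                      * Fx ** j * (1 - Fx) ** (m - j)
                      * Gx ** (i - j) * (1 - Gx) ** (N - m - i + j))
    return total


def brute_force_cdf(k: int, m: int, N: int, Fx: float, Gx: float) -> float:
    """Same probability by enumerating all 2^N patterns of 'observation <= x'."""
    probs = [Fx] * m + [Gx] * (N - m)
    total = 0.0
    for outcome in product([0, 1], repeat=N):
        if sum(outcome) >= k:
            p = 1.0
            for bit, q in zip(outcome, probs):
                p *= q if bit else 1 - q
            total += p
    return total


def binomial_tail(k: int, N: int, Fx: float) -> float:
    """Right-hand side of (2.3)."""
    return sum(comb(N, i) * Fx ** i * (1 - Fx) ** (N - i) for i in range(k, N + 1))
```

The brute-force version is exponential in $N$ and is included only as an independent check on small cases.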



In order to derive the distribution theory associated with the
sequential ranking procedures proposed in this paper the next lemma
will be useful. We consider a random variable $X$ with a continuous
distribution function $F(x)$ and define the sign of $X$ to be $1$ if
$X > 0$ and $-1$ if $X < 0$. Letting $E$ = sign of $X$, we can compute the
joint distribution function for $E$ and $|X|$ as

(2.4)   $F(x, y) = \begin{cases}
            0, & -\infty < y < 0, \quad -\infty < x < \infty \\
            0, & -\infty < y < \infty, \quad -\infty < x < -1 \\
            F(0) - F(-y), & 0 \le y < \infty, \quad -1 \le x < 1 \\
            F(y) - F(-y), & 0 \le y < \infty, \quad 1 \le x < \infty
        \end{cases}$

where $F(x, y) = P(E \le x, |X| \le y)$,

since for $-\infty < y < 0$, $-\infty < x < \infty$, $|X| \ge 0$ with probability 1 implies
$F(x, y) = 0$; for $-\infty < y < \infty$, $-\infty < x < -1$, $E = \pm 1$ with probability
1 implies $F(x, y) = 0$; for $0 \le y < \infty$, $-1 \le x < 1$,
$F(x, y) = P(-y \le X < 0) = F(0) - F(-y)$; and for $0 \le y < \infty$, $1 \le x < \infty$,
$F(x, y) = P(-y \le X \le y) = F(y) - F(-y)$.

In developing the properties of the signed sequential rank an
important role will be played by the dependency of the sign of $X$ and
$|X|$ and thus we establish a condition whereby $E$ and $|X|$ are
independent random variables in

Lemma 2.2. $|X|$ and sign of $X$ ($= E$) are independent if and only if
$F(-x) = F(0)\,[1 - F(x) + F(-x)]$ for all $x \ge 0$.

Proof: The marginal distributions for $E$ and $|X|$ are

    $P(E \le x) = \begin{cases} 0, & x < -1 \\ F(0), & -1 \le x < 1 \\ 1, & 1 \le x \end{cases}$
    and
    $P(|X| \le y) = \begin{cases} 0, & y < 0 \\ F(y) - F(-y), & 0 \le y \end{cases}$

and the product of the marginals is

    $P(E \le x)\,P(|X| \le y) = \begin{cases}
            0, & -\infty < y < 0, \quad -\infty < x < \infty \\
            0, & -\infty < y < \infty, \quad -\infty < x < -1 \\
            F(0)\,[F(y) - F(-y)], & 0 \le y < \infty, \quad -1 \le x < 1 \\
            F(y) - F(-y), & 0 \le y < \infty, \quad 1 \le x < \infty
        \end{cases}$

Thus the joint distribution function of $E$ and $|X|$ will factor
into the product of the marginal distributions if and only if
$F(0) - F(-y) = F(0)\,[F(y) - F(-y)]$ for all $0 \le y$ which is equivalent
to the condition in the lemma.
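
As a worked check of the condition in Lemma 2.2 (an illustration added here, not a case singled out in the text), suppose $F$ is continuous and symmetric about zero, so that $F(-x) = 1 - F(x)$ and $F(0) = \tfrac12$. Substituting into the right-hand side of the condition gives back the left-hand side:

```latex
\begin{aligned}
F(0)\,\bigl[\,1 - F(x) + F(-x)\,\bigr]
  &= \tfrac12\,\bigl[\,F(-x) + F(-x)\,\bigr]
     && \text{since } 1 - F(x) = F(-x) \\
  &= F(-x), && x \ge 0 .
\end{aligned}
```

So for any continuous distribution symmetric about zero the sign of $X$ and $|X|$ are independent, in agreement with the lemma.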



Remark: Throughout, we will assume that the basic random variables, 

usually denoted by X or Y, are defined on the same probability space 
and have continuous cumulative distribution functions. Thus the ranking 
procedures to be defined will always be determined uniquely except 
possibly for sets of measure zero.

3. The Sequential Rank. In the introduction we mentioned the
possibility of ranking observations as they are taken without reranking
the previous observations. We make this idea formal by

Definition 3.1. The sequential rank of $X_n$ relative to $X_1, X_2, \ldots, X_n$
is $k$ if $X_n = X_{nk}$, $k = 1, 2, \ldots, n$, where $X_{nk}$ is
the $k^{\text{th}}$ smallest in the set $\{X_1, X_2, \ldots, X_n\}$.

Thus the sequential rank of $X_1$ is always 1, the sequential rank
of $X_2$ is 1 or 2 according as $X_2 < X_1$ or $X_1 < X_2$, the sequential
rank of $X_3$ is 1, 2 or 3 according as $X_3$ is the smallest, next largest
or largest of the set $\{X_1, X_2, X_3\}$, etc. We use the notation $Z_i$
for the sequential rank of $X_i$.
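
Definition 3.1 translates directly into a short routine. The sketch below assumes distinct observations (ties have probability zero under the continuity assumption); the function name `sequential_ranks` is chosen for the example.

```python
import bisect
from typing import List, Sequence


def sequential_ranks(xs: Sequence[float]) -> List[int]:
    """Z_i = rank of x_i among x_1, ..., x_i (1 = smallest).

    Earlier observations are never reranked; each new value is only
    located within the sorted list of values seen so far.
    """
    seen: List[float] = []   # sorted copy of the observations so far
    ranks: List[int] = []
    for x in xs:
        ranks.append(bisect.bisect_left(seen, x) + 1)
        bisect.insort(seen, x)
    return ranks
```

For the observations 3.1, 1.4, 2.7, 5.0, 0.2 taken in that order, the first sequential rank is necessarily 1, and the $i^{\text{th}}$ can only take values from 1 to $i$.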

Lemma 3.1. There is a one to one correspondence between the set
of $n!$ possible orderings $X_{i_1} < X_{i_2} < \cdots < X_{i_n}$ and the $n!$
possible sequential rank vectors $(Z_1, Z_2, \ldots, Z_n)$.

Proof: We can consider $(X_1, X_2, \ldots, X_n) = (x_1, x_2, \ldots, x_n)$
where the $x_i$ are $n$ distinct real numbers and the set $\{(x_{i_1}, x_{i_2}, \ldots, x_{i_n})\}$
consisting of the $n!$ vectors obtained by permuting the coordinates
of $(x_1, x_2, \ldots, x_n)$. The corresponding set $\{(X_{i_1}, X_{i_2}, \ldots, X_{i_n})\}$
gives the $n!$ possible orderings. Now define the mapping $\varphi$
from the set $\{(x_{i_1}, x_{i_2}, \ldots, x_{i_n})\}$ into the set
$\{(r_1, r_2, \ldots, r_n): r_1 = 1,\ r_2 = 1, 2,\ \ldots,\ r_n = 1, 2, \ldots, n\}$
by setting the $j^{\text{th}}$ coordinate of $\varphi(x_{i_1}, x_{i_2}, \ldots, x_{i_n})$
equal to the rank of $x_{i_j}$ in the set $x_{i_1}, x_{i_2}, \ldots, x_{i_j}$,
i.e. the $j^{\text{th}}$ coordinate is $r$ if $x_{i_j}$ is the $r^{\text{th}}$
smallest among $x_{i_1}, x_{i_2}, \ldots, x_{i_j}$. The mapping $\varphi$ is one-to-one
and onto. (This is almost identical to part of the proof of Theorem 1.1
in [2], page 993.)

By this lemma we mean that if we consider each ordering, say
$X_{i_1} < X_{i_2} < \cdots < X_{i_n}$, of a set of observations $\{X_1, X_2, \ldots, X_n\}$
and use Definition 3.1 to obtain the associated sequential rank vector
$(Z_1, Z_2, \ldots, Z_n)$, the sequential rank vector is uniquely determined
and moreover the sequential rank vector uniquely determines the ordering.
Since a particular ordering of $X_1, X_2, \ldots, X_n$ also determines
the ordinary rank vector $(T_1, T_2, \ldots, T_n)$ in a one-to-one manner,
there exists a one-to-one mapping between the set of sequential rank
vectors and the set of ordinary rank vectors.
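
The correspondence in Lemma 3.1 can be verified by enumeration for small $n$. The following sketch, with hypothetical helper names, builds the sequential rank vector of every permutation and also gives an explicit inverse that recovers the ordering from the sequential ranks alone.

```python
import bisect
from itertools import permutations
from typing import List, Sequence


def sequential_ranks(xs: Sequence[float]) -> List[int]:
    """Rank of each observation among itself and its predecessors."""
    seen: List[float] = []
    ranks: List[int] = []
    for x in xs:
        ranks.append(bisect.bisect_left(seen, x) + 1)
        bisect.insort(seen, x)
    return ranks


def ordering_from_sequential_ranks(z: Sequence[int]) -> List[int]:
    """Inverse map: 0-based indices of the observations, smallest first.

    Observation i has rank z[i] among the first i+1 observations, so it is
    inserted at position z[i]-1 of the ordering built so far.
    """
    order: List[int] = []
    for i, r in enumerate(z):
        order.insert(r - 1, i)
    return order


def all_sequential_rank_vectors(n: int) -> set:
    """Sequential rank vectors produced by the n! orderings of n values."""
    return {tuple(sequential_ranks(p)) for p in permutations(range(1, n + 1))}
```

For $n = 4$ the set returned by `all_sequential_rank_vectors` has $24 = 4!$ elements, one for each ordering, which is the one-to-one property asserted by the lemma.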

In order to obtain the probability distribution for sequential
rank vectors notice that since a particular ordering $X_{i_1} < X_{i_2} < \cdots < X_{i_n}$
determines in a one-to-one manner an ordinary rank vector and a
sequential rank vector, it is enough to determine a mapping from the
ordinary rank vector determined by the ordering, to the sequential rank
vector determined by the same ordering. The distribution of $(Z_1, Z_2, \ldots, Z_n)$
is then available for a wide class of distributions of the
basic variables $X_1, X_2, \ldots, X_n$ since Hoeffding has given the
distribution of $(T_1, T_2, \ldots, T_n)$ in [3].

Consider the indicator function

    $\chi(x, y) = \begin{cases} 1 & \text{if } x \le y \\ 0 & \text{if } x > y \end{cases}$

and for $X_1, X_2, \ldots, X_n$ define the mapping

    $\varphi(x_1, x_2, \ldots, x_n) = \left( \sum_{j=1}^{1} \chi(x_j, x_1),\ \sum_{j=1}^{2} \chi(x_j, x_2),\ \ldots,\ \sum_{j=1}^{n} \chi(x_j, x_n) \right)$

The $i^{\text{th}}$ coordinate $\sum_{j=1}^{i} \chi(X_j, X_i)$ is equal to the number of X's in
$\{X_1, X_2, \ldots, X_i\}$ which are less than or equal to $X_i$, that is, the
sequential rank of $X_i$. But since $X_i < X_j$ iff $T_i < T_j$ ($i \ne j$) we
have

    $\chi(X_i, X_j) = \chi(T_i, T_j)$,

and this holds for all $i$ and $j$. Hence we have

(3.1)   $\varphi(X_1, X_2, \ldots, X_n) = \varphi(T_1, T_2, \ldots, T_n) = (Z_1, Z_2, \ldots, Z_n)$,

and $\varphi$ is a mapping from the ordinary rank vectors to the sequential
rank vectors corresponding to a particular ordering of the basic
variables.
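
Relation (3.1) says that applying the mapping to the observations or to their ordinary ranks gives the same sequential rank vector. A small sketch of this, assuming distinct observations and using hypothetical function names `chi`, `phi` and `ordinary_ranks`:

```python
from typing import List, Sequence


def chi(x: float, y: float) -> int:
    """Indicator of x <= y."""
    return 1 if x <= y else 0


def phi(v: Sequence[float]) -> List[int]:
    """i-th coordinate counts how many of v_1..v_i are <= v_i."""
    return [sum(chi(v[j], v[i]) for j in range(i + 1)) for i in range(len(v))]


def ordinary_ranks(xs: Sequence[float]) -> List[int]:
    """Fixed-sample ranks T_1..T_n (assumes distinct observations)."""
    ordered = sorted(xs)
    return [ordered.index(x) + 1 for x in xs]
```

For example, the observations 3.1, 1.4, 2.7, 5.0, 0.2 have ordinary ranks 4, 2, 3, 5, 1, and `phi` applied to either list returns the same vector, since only the pairwise order relations enter the sums.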

Let $f_i$, $i = 1, 2, \ldots, n$ be continuous, non-decreasing functions
defined on the unit interval such that $f_i(0) = 1 - f_i(1) = 0$ for each
$i$. Denote by $\mathcal{F}(f_1, f_2, \ldots, f_n)$ the family of all $(F_1, F_2, \ldots, F_n)$
such that $F_i = f_i(F)$ where $F$ runs through all continuous distributions.
Now if $X_1, X_2, \ldots, X_n$ are independent and distributed according to
$F_1, F_2, \ldots, F_n$, Lehmann has shown in [1] that

(a) the distribution of the ordinary ranks $T_1, T_2, \ldots, T_n$
obtained from $X_1, X_2, \ldots, X_n$ is constant within each family
$\mathcal{F}(f_1, f_2, \ldots, f_n)$. This is lemma 3.2.

(b) the power of any rank test depends only on $f_1, f_2, \ldots, f_n$,
and that uniformly most powerful tests exist. This is Theorem 3.1.

Because of the one-to-one correspondence between rank vectors and
sequential rank vectors properties (a) and (b) are preserved for
sequential ranks. The reason for this is that in computing sequential rank
vectors we are merely identifying different points in $n$-dimensional
space with each possible ordering $X_{i_1} < X_{i_2} < \cdots < X_{i_n}$ than when
ordinary rank vectors are computed. Thus the probability associated
with any subset of ordinary rank vectors can also be associated with
a unique subset of sequential rank vectors and we have, analogously as
in [1],

Theorem 3.1. Given $n$ functions $f_1^0, f_2^0, \ldots, f_n^0$ and any
sequential rank test of the hypothesis $H$: $(F_1, F_2, \ldots, F_n) \in
\mathcal{F}(f_1^0, f_2^0, \ldots, f_n^0)$ (i.e. a test based on the sequential ranks), the
power of this test depends only on $f_1^0, f_2^0, \ldots, f_n^0$. That is, if
$(F_1, F_2, \ldots, F_n)$ and $(F_1', F_2', \ldots, F_n')$ belong to the same class
$\mathcal{F}(f_1, f_2, \ldots, f_n)$ the test has the same power against these two
alternatives. Furthermore given any class of alternatives $K$:
$(F_1, F_2, \ldots, F_n) \in \mathcal{F}(f_1, f_2, \ldots, f_n)$ there exists a uniformly
most powerful test based on the sequential ranks for testing $H$
against $K$.

When $X_1, X_2, \ldots, X_n$ are independent and identically distributed
the sequential ranks are independent with distribution

    $P(Z_i = k) = 1/i$,   $k = 1, 2, \ldots, i$,   $i = 1, 2, \ldots, n$.

A proof of this is given in [2]. We see that the mapping defined in
(3.1) takes the vector of dependent ranks $(T_1, T_2, \ldots, T_n)$ into the
vector of independent sequential ranks $(Z_1, Z_2, \ldots, Z_n)$. Thus
according to Theorem 3.1 and the discussion leading to it we lose nothing
in the matter of hypothesis testing by considering sequential ranks
instead of ordinary ranks, and in fact when we are dealing with
independent and identically distributed random variables we find that the
sequential ranks are independent.

Since there is a one-to-one correspondence between the ordered
observations and the sequential rank vector, the distribution theory
for sequential rank vectors is also completely specified by

(3.2)   $P(X_{i_1} \le X_{i_2} \le \cdots \le X_{i_n})
            = \int \cdots \int_{-\infty < x_{i_1} < \cdots < x_{i_n} < \infty} \prod_{j=1}^{n} d f_{i_j}(F(x_{i_j}))
            = \int \cdots \int_{0 < y_1 < y_2 < \cdots < y_n < 1} \prod_{j=1}^{n} d f_{i_j}(y_j)$

where $y_j = F(x_{i_j})$ and the $X_i$ are assumed to be independent in this
calculation. Let $f = (f_{i_1}, f_{i_2}, \ldots, f_{i_n})$ and write
$P(X_{i_1} < X_{i_2} < \cdots < X_{i_n}) = P(f)$. The distribution function for the $n!$
vectors $(Z_1, Z_2, \ldots, Z_n)$ is obtained by computing $P(f)$ for all
possible permutations of the components of $f$. In order to determine
the marginal distribution for $Z_i$ we notice that $Z_i = k$ if and only if
$X_i$ is the $k^{\text{th}}$ smallest among the first $i$ observations, and we get

(3.3)   $P(Z_i = k) = \sum P(f)$,   $f = (f_{j_1}, f_{j_2}, \ldots, f_{j_i})$

where $f_i$ is the $k^{\text{th}}$ coordinate of $f$ and the summation is taken
over the $(i-1)!$ permutations of the coordinates leaving $f_i$ fixed at
the $k^{\text{th}}$ coordinate.

For the special case where the $X_i$ are taken to be identically
distributed, we can take $f_i(x) = x$ without loss of generality, and it
is easy to compute (3.2) and (3.3) to get

(3.4)   $P(f) = 1/n!$   and   $P(Z_i = k) = 1/i$,   $k = 1, 2, \ldots, i$,   $i = 1, 2, \ldots, n$

yielding the independence of $Z_1, Z_2, \ldots, Z_n$ as noted above.

Another special case, to be used later, is when the $f_i$ are taken
to be the Lehmann alternatives, introduced in [1]. We let $F_i(x) = F^{a_i}(x)$,
$a_i > 0$, and in this case a straightforward computation gives

(3.5)   $P(X_1 < X_2 < \cdots < X_n) = \dfrac{\prod_{i=1}^{n} a_i}{\prod_{i=1}^{n} \left( \sum_{j=1}^{i} a_j \right)}$

By relabeling the X's, the probability of any order of the X's can
be found using (3.5), giving all the values needed in (3.2) to specify
the distribution of the sequential rank vectors.
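
A hedged numerical sketch of the ordering probability under Lehmann alternatives, taking the form $\prod_i a_i / \prod_i (a_1 + \cdots + a_i)$ that Section 4 uses when it specializes (3.5). Exact rational arithmetic is used so that the probabilities of all orderings can be checked to sum to one; the function name `lehmann_order_probability` is an assumption for the example.

```python
from fractions import Fraction
from itertools import permutations
from typing import Sequence


def lehmann_order_probability(a: Sequence) -> Fraction:
    """P(X_1 < X_2 < ... < X_n) when X_i has distribution F**a_i.

    The exponents are listed in the order of the observations from
    smallest to largest; the result does not depend on F.
    """
    num = Fraction(1)
    den = Fraction(1)
    partial = Fraction(0)  # running sum a_1 + ... + a_i
    for ai in a:
        ai = Fraction(ai)
        if ai <= 0:
            raise ValueError("exponents must be positive")
        num *= ai
        partial += ai
        den *= partial
    return num / den


def total_over_orderings(a: Sequence) -> Fraction:
    """Sum of the ordering probabilities over all relabelings of the X's."""
    return sum(lehmann_order_probability(p) for p in permutations(a))
```

With all exponents equal the formula reduces to $1/n!$, in agreement with (3.4), and for two observations with exponents 1 and 2 it gives $P(X_1 < X_2) = 2/3$, which can be confirmed directly from $\int_0^1 u \, d(u^2)$.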
4. An Application of Sequential Ranking to Hypothesis Testing.

In the nonparametric, fixed sample size, two sample problem, it is
assumed that there are available two sets of observations $\{X_1, X_2, \ldots, X_m\}$
and $\{Y_1, Y_2, \ldots, Y_n\}$, each set from some probability
distribution. The problem is to test the hypothesis that the
distributions are the same, against the alternative that they are different.
Usually the alternative is more restrictive as when only a shift in
location is considered. In this section we consider the nonparametric
two sample problem as a sequential problem rather than fixed sample
size.

Let $X_i$, $i = 1, 2, \ldots$ and $Y_j$, $j = 1, 2, \ldots$ be independent
random variables and assume we wish to test

    $H$: $G = F$   against   $K$: $G = f(F)$

where $F$ is the continuous cumulative distribution of the X's and
$G$ the continuous cumulative distribution of the Y's. We propose to
use the sequential probability ratio statistic based on the sequential
ranks and we can assume the observations to be taken alternately as

    $X_1, Y_1, X_2, Y_2, \ldots, X_n, Y_n, \ldots$

Let $Z^N = (Z_1, Z_2, \ldots, Z_N)$ be the sequential rank vector based on the
first $N$ observations and write $P_1(Z^N)/P_0(Z^N)$ as the sequential
probability ratio, $P_1$ referring to the alternative to the hypothesis,
$P_0$ to the hypothesis.

Under the hypothesis $P(Z^N = z) = 1/N!$ and $P_0(Z^N) = 1/N!$. Under
the alternative we can compute $P(Z^N = z)$ by noting that each outcome
vector $z$ corresponds, in a one-to-one manner, to a particular order
of the X's and Y's. For example

    $Z^3 = (1, 1, 1) \Leftrightarrow X_2 < Y_1 < X_1$,    $Z^3 = (1, 2, 1) \Leftrightarrow X_2 < X_1 < Y_1$.

Each $Z^N$ in turn corresponds to a vector ($(F, G, F)$ or $(F, F, G)$
as in our example) of F's and G's meaning that the observation
appearing in the $i^{\text{th}}$ smallest position in the ordering of X's and
Y's has the distribution $F$ or $G$ according as $F$ or $G$ appears
as the $i^{\text{th}}$ coordinate of the $F$, $G$ vector. Thus to compute
$P(Z^N = z)$ for all possible values of $z$ we need only compute

    $P(U_1 < U_2 < \cdots < U_N)$

where $U_i$ is an X or a Y according to the outcome. In particular
when $f$ is a continuous increasing function on the unit interval with
$f(0) = 1 - f(1) = 0$, the probability distribution is constant for all
continuous distributions $F$ and depends only on $f$. In fact we have

    $P(U_1 < U_2 < \cdots < U_N)
        = \int \cdots \int_{-\infty < t_1 < \cdots < t_N < \infty} \prod_{i=1}^{N} d f_i(F(t_i))
        = \int \cdots \int_{0 < y_1 < \cdots < y_N < 1} \prod_{i=1}^{N} d f_i(y_i)$

by letting $y_i = F(t_i)$ where $f_i(F(t_i)) = F(t_i)$ when $U_i = X_i$ and
$f_i(F(t_i)) = f(F(t_i))$ when $U_i = Y_i$.

In the special case of Lehmann alternatives $f(x) = x^a$, $a > 0$,
and by (3.5) we get, for $N$ even,

    $P_1(Z^N) = \dfrac{a^{N/2}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} A_j \right)}$

    where $A_i = \begin{cases} 1 & \text{if } U_i = X_i \\ a & \text{if } U_i = Y_i \end{cases}$

and the probability ratio reduces to

    $\dfrac{P_1(Z^N)}{P_0(Z^N)} = \dfrac{N!\, a^{N/2}}{\prod_{i=1}^{N} \left( \sum_{j=1}^{i} A_j \right)}$

A similar result holds for $N$ odd. The vector $A^N = (A_1, A_2, \ldots, A_N)$,
corresponding to the vector of F's and G's, determines $P(Z^N = z)$
for $[(N/2)!]^2$ outcomes $z$ out of the $N!$ possible. We can compute
the probability ratios at each stage using the following relations:



(4,l) S N P 

o 



Ni a 



N-l 

2 



N / i 

n(^ A ; 

i=l J 



NI a' 



N/2 



N / i 
i=l J 



N odd 



N even 
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(4.2) 



S N+1 P 

o 



(n+i); 



N+l 

2 



Z-l/ i \ N / i 
i=l J i=Z-l 



j=l j 



(n+i); 



N/2 



l 



Z-l/ i \ N / i 

n ( i/j) n (i ♦ 

i=l J i=Z-l 



N odd 



N even 



where Z = sequential rank of $Y_{(N+1)/2}$, N odd, and Z = sequential rank
of $X_{(N+2)/2}$, N even. At the (N+1)st observation Z is determined,
and if Z = k, the (N+1)st observation came between the (k-1)st
and the kth smallest observations of the preceding N observations.
Thus $A_{N+1} = (A_1, A_2, \ldots, A_{k-1}, A^*, A_k, \ldots, A_N)$ where $A^* = 1$ if
the (N+1)st observation is an X and $A^* = a$ if the observation is
a Y. Using (4.1) and (4.2) and Z we can pass from $S_N$ to $S_{N+1}$ as
the observations are taken. For example $S_1 = 1$,

\[
S_2 =
\begin{cases}
\dfrac{2a}{1+a} & \text{if } X_1 < Y_1 \iff A_2 = (1, a) \\[1.5ex]
\dfrac{2}{1+a} & \text{if } Y_1 < X_1 \iff A_2 = (a, 1)
\end{cases}
\]

\[
S_3 =
\begin{cases}
\dfrac{6a}{(1+a)(2+a)} & \text{if } X_1 < Y_1 < X_2 \text{ or } X_2 < Y_1 < X_1 \iff A_3 = (1, a, 1) \\[1.5ex]
\dfrac{6}{(1+a)(2+a)} & \text{if } Y_1 < X_1 < X_2 \text{ or } Y_1 < X_2 < X_1 \iff A_3 = (a, 1, 1) \\[1.5ex]
\dfrac{3a}{2+a} & \text{if } X_1 < X_2 < Y_1 \text{ or } X_2 < X_1 < Y_1 \iff A_3 = (1, 1, a)
\end{cases}
\]
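The values of $S_2$ and $S_3$ above can be checked numerically. The following is a minimal sketch in Python, using exact rational arithmetic, that evaluates the ratio $N!\,a^{(\text{number of Y's})} / \prod_i \sum_{j \le i} A_j$ directly from an order configuration. The function name and the encoding of a configuration as a string of 'X' and 'Y' (smallest observation first) are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction
from math import factorial


def ratio_from_configuration(config, a):
    """Probability ratio S_N for an order configuration.

    config: string of 'X'/'Y' labels, ordered from the smallest to the
            largest observation (hypothetical encoding).
    a:      Lehmann exponent, so that Y is distributed as F**a.
    """
    a = Fraction(a)
    n_y = config.count("Y")
    partial = Fraction(0)   # running sum A_1 + ... + A_i
    denom = Fraction(1)     # product of the running sums
    for label in config:
        partial += a if label == "Y" else 1
        denom *= partial
    return factorial(len(config)) * a**n_y / denom


if __name__ == "__main__":
    a = 4
    for cfg in ["XY", "YX", "XYX", "YXX", "XXY"]:
        print(cfg, ratio_from_configuration(cfg, a))
```

With a = 4 the configuration "XY" gives 2a/(1+a) = 8/5 and "XXY" gives 3a/(2+a) = 2, in agreement with the expressions for $S_2$ and $S_3$.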



We noted before that under the hypothesis a = 1 the sequential
ranks $Z_1, Z_2, \ldots, Z_N$ are independent. However, when $a \ne 1$ we do
not have this independence property. Consider the case N = 3 where
we observe $X_1, Y_1, X_2$ in that order. The possible outcomes are

    Ordered observations    Sequential ranks    Probability

    X_1 < X_2 < Y_1         (1, 2, 2)           a / [2(2+a)]
    X_2 < X_1 < Y_1         (1, 2, 1)           a / [2(2+a)]
    X_1 < Y_1 < X_2         (1, 2, 3)           a / [(1+a)(2+a)]
    X_2 < Y_1 < X_1         (1, 1, 1)           a / [(1+a)(2+a)]
    Y_1 < X_1 < X_2         (1, 1, 3)           1 / [(1+a)(2+a)]
    Y_1 < X_2 < X_1         (1, 1, 2)           1 / [(1+a)(2+a)]



and the marginal distributions are easily computed as

    P(Z_1 = 1) = 1              P(Z_3 = 1) = a(3+a) / [2(1+a)(2+a)]
    P(Z_2 = 1) = 1/(1+a)        P(Z_3 = 2) = (2+a+a^2) / [2(1+a)(2+a)]
    P(Z_2 = 2) = a/(1+a)        P(Z_3 = 3) = 1/(2+a)

Now $P((Z_1, Z_2, Z_3) = (1, 1, 1)) = a/[(1+a)(2+a)]$ and

\[
P(Z_1 = 1)\,P(Z_2 = 1)\,P(Z_3 = 1) = \frac{a}{(1+a)(2+a)} \cdot \frac{3+a}{2(1+a)},
\]

and it follows that $Z_1, Z_2, Z_3$ cannot be independent unless a = 1, since
independence of $Z_1, Z_2, Z_3$ implies $(3+a)/[2(1+a)] = 1$, which in turn
implies a = 1. Thus we have

Theorem 4.1 Let $X_1, Y_1, X_2, Y_2, \ldots$ be independent random
variables with $X_i$ distributed according to F and $Y_i$ distributed
according to $F^a$, a > 0. The sequential ranks based on such a sequence
are independent if and only if a = 1.
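The N = 3 table and the independence claim can be reproduced by brute force. The sketch below enumerates every ordering of the observations, assigns it the Lehmann-alternative probability $\prod_i w_i / \sum_{j \le i} w_j$ (weight 1 for an X, a for a Y), and converts each ordering to its sequential-rank vector. The function names and the 'X'/'Y' string encoding of the observation order are assumptions made for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations, product


def rank_distribution(labels, a):
    """Joint distribution of the sequential ranks.

    labels: observation order, e.g. "XYX" for X_1, Y_1, X_2 (hypothetical encoding).
    Returns {sequential-rank tuple: exact probability}.
    """
    a = Fraction(a)
    n = len(labels)
    dist = defaultdict(Fraction)
    for order in permutations(range(n)):   # order[k] = index of the k-th smallest
        partial, prob = Fraction(0), Fraction(1)
        for idx in order:
            w = a if labels[idx] == "Y" else Fraction(1)
            partial += w
            prob *= w / partial
        position = {idx: k for k, idx in enumerate(order)}
        ranks = tuple(
            1 + sum(position[j] < position[i] for j in range(i)) for i in range(n)
        )
        dist[ranks] += prob
    return dict(dist)


def is_independent(dist):
    """True if the joint distribution equals the product of its marginals."""
    n = len(next(iter(dist)))
    marginals = [defaultdict(Fraction) for _ in range(n)]
    for ranks, p in dist.items():
        for i, r in enumerate(ranks):
            marginals[i][r] += p
    for ranks in product(*[range(1, i + 2) for i in range(n)]):
        prod_marg = Fraction(1)
        for i, r in enumerate(ranks):
            prod_marg *= marginals[i][r]
        if dist.get(ranks, Fraction(0)) != prod_marg:
            return False
    return True


if __name__ == "__main__":
    for a in (1, 4):
        d = rank_distribution("XYX", a)
        print(a, d[(1, 1, 1)], is_independent(d))
```

For a = 4 the entry for (1, 1, 1) is a/[(1+a)(2+a)] = 2/15, and the independence check fails; for a = 1 it succeeds, as Theorem 4.1 states.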

As an illustration of the sequential probability ratio test based 
on the sequential ranks consider the data given below. 



    X_1 = 3.926    X_10 = 4.08     Y_1 = 4.70     Y_10 = 1.56
    X_2 = 3.45     X_11 = 3.67     Y_2 = 4.15     Y_11 = 4.29
    X_3 = 2.00     X_12 = 2.94     Y_3 = 4.55     Y_12 = 1.74
    X_4 = 2.28     X_13 = 5.90     Y_4 = 3.31     Y_13 = 2.17
    X_5 = 3.494    X_14 = 2.18     Y_5 = 2.13     Y_14 = 1.97
    X_6 = 4.25     X_15 = 5.39     Y_6 = 4.686    Y_15 = 4.689
    X_7 = 2.382    X_16 = 2.74     Y_7 = 2.68     Y_16 = 2.87
    X_8 = 3.02     X_17 = 3.492    Y_8 = 2.36     Y_17 = 3.17
    X_9 = 3.26     X_18 = 2.70     Y_9 = 3.93

The data is taken from Table 600 A, page 600 of "Statistics, A New
Approach," W. A. Wallis and H. V. Roberts, The Free Press, Glencoe,
Illinois. If we assume X has some continuous distribution F and Y
has $F^a$ as a distribution then

\[
P(X < Y) = \int_{-\infty}^{\infty} F(y)\, dF^a(y) = \frac{a}{1+a} .
\]

Suppose we consider a = 4, P(X < Y) = .8 as the alternative to the
hypothesis a = 1. We take as boundaries for the sequential probability
ratio test

\[
A = \frac{1-\beta}{\alpha} = \frac{1 - .05}{.05} = 19,
\qquad
B = \frac{\beta}{1-\alpha} = \frac{.05}{1 - .05} = .0526,
\]

and if $S_N < B$ we accept H: a = 1, if $S_N > A$ we accept K: a = 4,
and if $B < S_N < A$ we take another observation and compute $S_{N+1}$,
repeating the test. Using the computational formulas (4.1) and (4.2)
we get



    S_1  = 1        S_11 = .454
    S_2  = 1.6      S_12 = .725
    S_3  = 2.0      S_13 = .764
    S_4  = 3.2      S_14 = .568
    S_5  = 4.15     S_15 = .467
    S_6  = 6.65     S_16 = .234
    S_7  = 8.75     S_17 = .168
    S_8  = 4.0      S_18 = .242
    S_9  = 3.31     S_19 = .138
    S_10 = .809     S_20 = .0288

and since $S_{20} < .0526$ we accept H at the 20th observation.

Notice that even though the probability ratio is written as
a function of the sequential ranks in (4.1) and (4.2), it can also be
computed as a function of the order configuration. By this we mean,
for example, that the order configuration 1 a 1 stands for
$X_1 < Y_1 < X_2$ or $X_2 < Y_1 < X_1$, and a 1 1 stands for
$Y_1 < X_1 < X_2$ or $Y_1 < X_2 < X_1$.
Each order configuration determines a value of $S_N$ as a function of a.
It can happen that for some value of $a \ne 1$ and two different
configurations, $S_N$ takes on the same value. As an example consider N = 6
and the configurations a 1 1 1 a a and 1 a a a 1 1. The denominators
in $S_6$ for these configurations are

\[ g_1(a) = a(a+1)(a+2)(a+3)(2a+3)(3a+3) \]
\[ g_2(a) = 1(1+a)(1+2a)(1+3a)(2+3a)(3+3a) \]

respectively. For a = 1/2 and a = 2 we get
$g_1(1/2) = g_2(1/2) = 945/8$ and $g_1(2) = g_2(2) = 7560$.
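The coincidence of the two denominators is easy to confirm with exact arithmetic. A small sketch follows; the helper name and the 'X'/'Y' string encoding of a configuration (X for an entry 1, Y for an entry a) are assumptions introduced for the example.

```python
from fractions import Fraction


def denominator(config, a):
    """Product of the partial sums A_1 + ... + A_i for an order configuration."""
    a = Fraction(a)
    partial, prod = Fraction(0), Fraction(1)
    for label in config:
        partial += a if label == "Y" else 1
        prod *= partial
    return prod


if __name__ == "__main__":
    g1 = lambda a: denominator("YXXXYY", a)   # configuration a 1 1 1 a a
    g2 = lambda a: denominator("XYYYXX", a)   # configuration 1 a a a 1 1
    for a in (Fraction(1, 2), 2, 3):
        print(a, g1(a), g2(a), g1(a) == g2(a))
```

The two products agree at a = 1/2 and a = 2 but differ at, for example, a = 3, so equal values of $S_6$ for different configurations occur only for particular a.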

Let c(t) be the number of different configurations such that
$S_N = t$. We have

(4.3)
\[
P(S_N = t \mid a) = c(t)\,
\frac{[N/2]!\,(N - [N/2])!\; a^{[N/2]}}{\prod_{i=1}^{N} \sum_{j=1}^{i} a_j}
\]

where the $a_j$'s correspond to any particular configuration making
$S_N = t$ ($a_j = 1$ or a according as X or Y is in the jth place).
(4.3) follows because any two configurations which make $S_N = t$ have
the same probability under the alternative to the hypothesis. Under the
hypothesis

(4.4)
\[
P(S_N = t \mid a = 1) = \frac{[N/2]!\,(N - [N/2])!\; c(t)}{N!} .
\]

In (4.3) and (4.4) [x] is the greatest integer function.

In Wald's sequential probability ratio test the approximations
$A \le (1-\beta)/\alpha$ and $B \ge \beta/(1-\alpha)$ are valid when the probability
of termination of the test is 1. These inequalities were derived under
the assumption that the basic sequence of probability ratios was
determined from an independent sequence of observations and that the
sequential probability ratio at the nth observation is formed as a
product of independent and identically distributed random variables.
Under the alternative hypothesis we have found that the sequential
ranks are not independent. Thus we must now show that the test
terminates with probability 1 in order to interpret $\alpha$ and $\beta$ as
error probabilities.
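The rank test with Wald's boundaries can be sketched as follows. This is only an illustration under stated assumptions: the function name, the (value, label) input format and the returned tuple are hypothetical; the boundaries are taken with equality, A = (1-β)/α and B = β/(1-α); and the ratio is recomputed from the current order configuration at each step rather than updated through (4.2). The sample values are the first ten observations of the illustration earlier in this section, with $X_1$ read as 3.926, the reading consistent with $S_2 = 1.6$.

```python
import bisect
from fractions import Fraction
from math import factorial


def rank_sprt(observations, a, alpha, beta):
    """Sequential probability ratio test on the order configuration.

    observations: iterable of (value, label), label 'X' or 'Y', in sampling order.
    Returns (decision, n, ratios); decision is None if no boundary was crossed.
    """
    a, alpha, beta = Fraction(a), Fraction(alpha), Fraction(beta)
    upper = (1 - beta) / alpha      # boundary A
    lower = beta / (1 - alpha)      # boundary B
    values, weights, ratios = [], [], []
    n_y = 0
    for n, (value, label) in enumerate(observations, start=1):
        pos = bisect.bisect_left(values, value)       # position in the ordering
        values.insert(pos, value)
        weights.insert(pos, a if label == "Y" else Fraction(1))
        n_y += label == "Y"
        partial, denom = Fraction(0), Fraction(1)
        for w in weights:
            partial += w
            denom *= partial
        s = factorial(n) * a**n_y / denom
        ratios.append(s)
        if s < lower:
            return "accept H", n, ratios
        if s > upper:
            return "accept K", n, ratios
    return None, len(ratios), ratios


if __name__ == "__main__":
    sample = [(3.926, "X"), (4.70, "Y"), (3.45, "X"), (4.15, "Y"), (2.00, "X"),
              (4.55, "Y"), (2.28, "X"), (3.31, "Y"), (3.494, "X"), (2.13, "Y")]
    decision, n, ratios = rank_sprt(sample, a=4, alpha="0.05", beta="0.05")
    print(decision, n, [round(float(s), 3) for s in ratios])
```

On these ten observations no boundary is crossed, so sampling would continue; the first few ratios are 1, 8/5, 2 and 16/5.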

It is enough to show that the test terminates with probability 1
considering only N even. For N even we can write

(4.5)
\[
S_N^{-1} = \prod_{i=1}^{N} a^{-1/2}\, \frac{1}{i} \sum_{j=1}^{i} A_j
\]

and define $\bar{A}_i^N = \frac{1}{i}\sum_{j=1}^{i} A_j$ with $Y_i^N = a^{-1/2}\bar{A}_i^N$ and
$Z_i^N = \log Y_i^N$. We consider first the case where the null hypothesis
is true. $A_1, A_2, \ldots, A_N$ are dependent random variables with

(4.6)
\[ P(A_j = 1) = P(A_j = a) = 1/2 \]

giving $E(A_j) = E(\bar{A}_i^N) = (1+a)/2$. For $i \ne j$ we have

(4.7)
\[
\begin{aligned}
P(A_i = 1, A_j = 1) &= P(A_i = a, A_j = a) = \frac{1}{4}\,\frac{N-2}{N-1} \\
P(A_i = 1, A_j = a) &= P(A_i = a, A_j = 1) = \frac{1}{4}\,\frac{N}{N-1}
\end{aligned}
\]

and a simple computation gives $\mathrm{Var}(A_j) = \bigl(\frac{1-a}{2}\bigr)^2$ and
$\mathrm{Cov}(A_i, A_j) = -\frac{1}{N-1}\bigl(\frac{1-a}{2}\bigr)^2$. Also
$E(Y_i^N) = a^{-1/2}\bigl(\frac{1+a}{2}\bigr) = \frac{a^{-1/2} + a^{1/2}}{2}$ and

\[
\begin{aligned}
\mathrm{Var}(Y_i^N) = a^{-1}\,\mathrm{Var}(\bar{A}_i^N)
 &= \frac{a^{-1}}{i^2}\Bigl\{ \sum_{j=1}^{i} \mathrm{Var}(A_j) + 2\sum_{j=1}^{i}\sum_{k=j+1}^{i} \mathrm{Cov}(A_j, A_k) \Bigr\} \\
 &= \frac{a^{-1}}{i^2}\Bigl\{ i\Bigl(\frac{1-a}{2}\Bigr)^2 - \frac{i(i-1)}{N-1}\Bigl(\frac{1-a}{2}\Bigr)^2 \Bigr\} \\
 &= \frac{1}{a}\Bigl(\frac{1-a}{2}\Bigr)^2 \frac{1}{N-1}\,\frac{N-i}{i}
\end{aligned}
\]

and notice that $\mathrm{Var}(Y_i^N)$ is decreasing in i as i = 1, 2, ..., N.

If 1 < a then $1 \le \bar{A}_i^N \le a$ and $a^{-1/2} \le Y_i^N \le a^{1/2}$. If a < 1 then
$a^{1/2} \le Y_i^N \le a^{-1/2}$.
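The mean and variance of $\bar{A}_i^N$ under the null hypothesis can be confirmed by enumerating all equally likely order configurations for a small even N. The sketch below does this exactly; the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations


def prefix_mean_moments(N, i, a):
    """Exact mean and variance of (A_1 + ... + A_i)/i under a = 1 sampling.

    Under the null hypothesis every placement of the N/2 Y's among the
    N ordered positions is equally likely; a Y contributes a, an X contributes 1.
    """
    a = Fraction(a)
    values = []
    for y_positions in combinations(range(N), N // 2):
        ys = set(y_positions)
        total = sum((a if j in ys else Fraction(1)) for j in range(i))
        values.append(total / i)
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


def closed_form(N, i, a):
    """Mean (1+a)/2 and variance ((1-a)/2)^2 (N-i) / (i (N-1))."""
    a = Fraction(a)
    return (1 + a) / 2, ((1 - a) / 2) ** 2 * Fraction(N - i, i * (N - 1))


if __name__ == "__main__":
    N, a = 6, 4
    for i in range(1, N + 1):
        print(i, prefix_mean_moments(N, i, a), closed_form(N, i, a))
```

Dividing the variance by a gives $\mathrm{Var}(Y_i^N)$; at i = N the variance is zero because the full sum is the same for every configuration.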

In order to show that the test terminates with probability one it
is enough to show that $S_N^{-1} \to \infty$ in probability. Thus for arbitrary
positive B we show that

\[
\lim_{N \to \infty} P(S_N^{-1} < 1/B) = \lim_{N \to \infty} P(\log S_N^{-1} < \log 1/B) = 0 .
\]

Let K = log 1/B and use Chebyshev's inequality to get

\[
\begin{aligned}
P_0\Bigl(\sum_{i=1}^{N} Z_i^N \le K\Bigr)
 &= P_0\Bigl(\sum_{i=1}^{N} Z_i^N - \sum_{i=1}^{N} E(Z_i^N) \le K - \sum_{i=1}^{N} E(Z_i^N)\Bigr) \\
 &\le P_0\Bigl(\Bigl|\sum_{i=1}^{N} Z_i^N - \sum_{i=1}^{N} E(Z_i^N)\Bigr| \ge -K + \sum_{i=1}^{N} E(Z_i^N)\Bigr)
\end{aligned}
\]

by taking N large enough to make $K - \sum_{i=1}^{N} E(Z_i^N) < 0$. This can be done
since $Y_i^N$ is bounded, and bounded away from 0, so that

\[
Z_i^N = \log Y_i^N = \log\lambda_a + \frac{Y_i^N - \lambda_a}{\lambda_a} - \frac{(Y_i^N - \lambda_a)^2}{2\eta_i^2},
\]

where $\lambda_a = E(Y_i^N) = (a^{-1/2} + a^{1/2})/2 > 1$ and $\eta_i$, a point between
$Y_i^N$ and $\lambda_a$, is bounded away from 0, and further

\[
E(Z_i^N) \ge \log\lambda_a - c\,\frac{N-i}{i(N-1)}, \qquad c > 0 .
\]

Thus

\[
\sum_{i=1}^{N} E(Z_i^N) \ge N\log\lambda_a - O(\log N) = O(N) > 0
\]

and

\[
\mathrm{Var}\Bigl(\sum_{i=1}^{N} Z_i^N\Bigr) \le \sum_{i=1}^{N} \mathrm{Var}(Z_i^N)
 + 2\sum_{i<j} \bigl(\mathrm{Var}(Z_i^N)\cdot\mathrm{Var}(Z_j^N)\bigr)^{1/2} .
\]

Now, expanding $\log Y_i^N$ in only two terms,

\[
\mathrm{Var}(Z_i^N) = E\bigl(\log Y_i^N - E(\log Y_i^N)\bigr)^2 \le E\bigl(\log Y_i^N - \log\lambda_a\bigr)^2
 = E\Bigl(\frac{Y_i^N - \lambda_a}{\eta_i}\Bigr)^2 \le c'\,\mathrm{Var}(Y_i^N) = c''\,\frac{N-i}{i(N-1)}
\]

and this bound is decreasing in i. Now we can write

\[
\mathrm{Var}\Bigl(\sum_{i=1}^{N} Z_i^N\Bigr) \le O(\log N) + 2c''\sum_{i=1}^{N} (N-i)\,\frac{N-i}{i(N-1)} = O(N\log N)
\]

and finally

\[
P_0\Bigl(\sum_{i=1}^{N} Z_i^N \le K\Bigr) \le \frac{O(N\log N)}{O(N^2)} \to 0 \quad \text{as } N \to \infty .
\]

Since $\log S_N^{-1} = \sum_{i=1}^{N} Z_i^N$ we have $S_N^{-1} \to \infty$ in probability, and when the
null hypothesis is true, the test terminates with probability 1.

More generally now consider



(4.8)
\[
P_N = P(A^{-1} < S_N^{-1} < B^{-1}) = P(K_1 - \mu_N < \log S_N^{-1} - \mu_N < K_2 - \mu_N)
\]

where $K_1 = \log A^{-1}$, $K_2 = \log B^{-1}$ and $\mu_N = E(\log S_N^{-1})$. For large
enough values of N

(4.9)
\[
P_N \le
\begin{cases}
P\bigl(|\log S_N^{-1} - \mu_N| > \mu_N - K_2\bigr) \le \dfrac{\mathrm{Var}(\log S_N^{-1})}{(\mu_N - K_2)^2} & \text{if } \mu_N \to \infty \\[2ex]
P\bigl(|\log S_N^{-1} - \mu_N| > K_1 - \mu_N\bigr) \le \dfrac{\mathrm{Var}(\log S_N^{-1})}{(K_1 - \mu_N)^2} & \text{if } \mu_N \to -\infty
\end{cases}
\]

The test will terminate with probability 1 as long as $P_N \to 0$,
and this is independent of the true distribution of the Y population
since the inequalities in (4.9) were obtained without reference to the
distribution of the Y's. In particular we found that when the X and
Y populations are identically distributed, $\mu_N = O(N)$ and
$\mathrm{Var}(\log S_N^{-1}) = O(N\log N)$.

The method just given to show that the probability of termination
of the test is one is not satisfactory for all alternatives since the
verification of condition (4.9) is difficult. We now consider a better
approach. As before take N even and write the probability ratio as

(4.10)
\[
T_N = S_N^{-1} = \frac{a^{-N/2}}{N!} \prod_{i=1}^{N} \sum_{j=1}^{i} A_j .
\]

In order to show that the probability of termination is one it is
enough to show that $N^{-1}\log T_N$ converges to some non-zero constant
since for fixed boundaries A, B, the equivalent formulation

\[
\frac{1}{N}\log A^{-1} < \frac{1}{N}\log T_N < \frac{1}{N}\log B^{-1}
\]

will terminate with probability one, provided $N^{-1}\log T_N$ converges in
probability to some non-zero constant.

Let 2n = N and let $Z_1, Z_2, \ldots, Z_N$ be the order statistics for
the combined sample. Define the empirical cumulative distribution
functions for the X's and Y's as

\[
F_n(t) = \frac{(\text{number of X's} \le t)}{n}, \qquad
G_n(t) = \frac{(\text{number of Y's} \le t)}{n} .
\]

Since $\sum_{j=1}^{i} A_j$ = (number of X's in $Z_1, Z_2, \ldots, Z_i$) + a (number of
Y's in $Z_1, Z_2, \ldots, Z_i$) we can write

\[
\frac{1}{n}\sum_{j=1}^{i} A_j = F_n(Z_i) + a\,G_n(Z_i)
\]

and

\[
\frac{1}{N}\log T_N = -\frac{1}{2}\log a - \log 2 + \log N - \frac{1}{N}\log N!
 + \frac{1}{N}\sum_{i=1}^{N} \log\bigl(F_n(Z_i) + a\,G_n(Z_i)\bigr) .
\]

Since $\lim_{N\to\infty} (\log N - \frac{1}{N}\log N!) = 1$, we have

\[
\begin{aligned}
\lim_{N\to\infty} N^{-1}\log T_N
 &= \log\frac{e}{2\sqrt{a}} + \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} \log\bigl(F_n(Z_i) + a\,G_n(Z_i)\bigr) \\
 &= \log\frac{e}{2\sqrt{a}} + \frac{1}{2}\int_{-\infty}^{\infty} \log\bigl(F(x) + a\,G(x)\bigr)\,\bigl(dF(x) + dG(x)\bigr),
\end{aligned}
\]

the latter limit following from a result of I. R. Savage and
J. Sethuraman communicated to the author by Sethuraman as

Theorem (Savage-Sethuraman) Let $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_n$
be independent random variables where the $X_i$ are distributed according
to the continuous distribution F and the $Y_i$ according to the continuous
distribution G. Let $Z_1, Z_2, \ldots, Z_N$ (N = 2n) be the order
statistics of the combined sample and let $F_n$ and $G_n$ be the empirical
cumulative distribution functions of the X's and Y's respectively.
Then

\[
N^{-1}\sum_{i=1}^{N} \log\bigl(F_n(Z_i) + a\,G_n(Z_i)\bigr) \to
\frac{1}{2}\int_{-\infty}^{\infty} \log\bigl(F(x) + a\,G(x)\bigr)\,\bigl(dF(x) + dG(x)\bigr)
\]

in probability. (see [10])

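In our case G = F or $F^a$ depending upon which hypothesis holds.
However we will consider the entire class of alternatives $G = F^b$, b > 0
which could hold. Let $N^{-1}\log T_N \to L_a(b)$. Then

(4.12)
\[
\begin{aligned}
L_a(b) &= \log\frac{e}{2\sqrt{a}} + \frac{1}{2a}\int_{-\infty}^{\infty} \log(F + aF^b)\, d(F + aF^b)
          + \frac{1}{2}\int_{-\infty}^{\infty} \log(F + aF^b)\, d\bigl((1 - 1/a)F\bigr) \\
       &= \log\frac{e}{2\sqrt{a}} + \frac{1}{2a}\int_0^{1+a} \log t\, dt
          + \frac{a-1}{2a}\int_0^1 \log(t + a t^b)\, dt \\
       &= -\log 2 - \frac{1}{2}\log a + \frac{1+a}{2a}\log(1+a)
          + \frac{a-1}{2a}\int_0^1 \log(1 + a t^{b-1})\, dt .
\end{aligned}
\]

The function $\int_0^1 \log(1 + a t^{b-1})\,dt$ decreases as b increases, and thus
$L_a(b)$ is monotone in b, decreasing when 1 < a, and increasing when
a < 1.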
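The behaviour of $L_a(b)$ can be explored numerically. The sketch below evaluates the last line of (4.12) with a plain midpoint rule; the function name, the step count and the choice of quadrature are assumptions made for illustration, and the rule is only moderately accurate when b < 1, where the integrand has an integrable logarithmic singularity at 0.

```python
import math


def limit_L(a, b, steps=100_000):
    """Midpoint-rule evaluation of L_a(b) from the last line of (4.12)."""
    h = 1.0 / steps
    integral = h * sum(
        math.log1p(a * ((k + 0.5) * h) ** (b - 1)) for k in range(steps)
    )
    return (
        -math.log(2)
        - 0.5 * math.log(a)
        + (1 + a) / (2 * a) * math.log1p(a)
        + (a - 1) / (2 * a) * integral
    )


if __name__ == "__main__":
    a = 4
    for b in (0.5, 1, 2, 4):
        print(b, round(limit_L(a, b), 4))
```

For a = 4 the printed values decrease in b, the value at b = 1 equals log[(1+a)/(2√a)] = log 1.25, and the value at b = a is negative.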

Under the null hypothesis b = 1 and

\[
L_a(1) = \log\frac{a^{-1/2} + a^{1/2}}{2} > 0 \quad \text{for } a \ne 1 .
\]

Under the alternative hypothesis b = a and

\[
L_a(a) = \log\frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2}\int_0^1 \frac{dt}{a + t^{1-a}} .
\]

In order to show that the test terminates with probability 1 we must
have $L_a(a) \ne 0$ for $a \ne 1$. In fact we will show that $L_a(a) < 0$
for $a \ne 1$. Notice first that it is enough to consider 0 < a < 1
since

\[
\begin{aligned}
L_{1/a}(1/a)
 &= \log\frac{a^{1/2} + a^{-1/2}}{2} - \frac{(a^{-1}-1)^2}{2}\int_0^1 \frac{dt}{a^{-1} + t^{1-1/a}} \\
 &= \log\frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2a^2}\int_0^1 \frac{a\,dt}{1 + a\,t^{1-1/a}} \\
 &= \log\frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2a^2}\int_0^1 \frac{a^2 s^{a-1}}{1 + a\,s^{a-1}}\, ds \qquad (t = s^a) \\
 &= \log\frac{a^{-1/2} + a^{1/2}}{2} - \frac{(a-1)^2}{2}\int_0^1 \frac{ds}{a + s^{1-a}} = L_a(a) .
\end{aligned}
\]

We can write

\[
\begin{aligned}
2L_a(a) &= \log\frac{(1+a)^2}{4a} - (a-1)^2\int_0^1 \frac{dt}{a + t^{1-a}} \\
        &= \int_0^1 \frac{(a-1)^2}{4a + (a-1)^2 t}\, dt - \int_0^1 \frac{(a-1)^2}{a + t^{1-a}}\, dt \\
        &= (a-1)^2\int_0^1 \Bigl\{\frac{1}{4a + (a-1)^2 t} - \frac{1}{a + t^{1-a}}\Bigr\}\, dt
\end{aligned}
\]

and we wish to show that $a + t^{1-a} < 4a + (a-1)^2 t$ for 0 < t < 1 and
0 < a < 1. Define

\[ h(a, t) = 3a + (a-1)^2 t - t^{1-a} \]

and notice that

\[
\frac{\partial h}{\partial t} = (a-1)^2 - (1-a)\,t^{-a} < (a-1)^2 - (1-a) = (a-1)\,a < 0
\]

\[
\frac{\partial^2 h}{\partial t^2} = a(1-a)\,t^{-(a+1)} > 0 .
\]

Since h(a, 0) = 3a, h(a, 1) = a(a + 1) we may conclude that
h(a, t) > 0, which makes the integrand in $2L_a(a)$ negative as was to
be proved.

We have shown that the sequential test terminates with probability
one under the null and alternative hypothesis and moreover the test will
terminate with probability one when the Y's are distributed according
to $F^b$ for b > 0 except possibly for only one value of b. This
follows from the monotonicity of $L_a(b)$.

We also remark here that for a fixed sample size test of

    H_0: X ~ F, Y ~ F
    against
    H_1: X ~ F, Y ~ F^a,   a ≠ 1, a > 0

using ranks of observations, the Neyman-Pearson theory would give a
most powerful test of the form

    accept H_0 for S_N^{-1} > K.

An equivalent test would be to accept $H_0$ if $\frac{1}{N}\log S_N^{-1} > \frac{1}{N}\log K$.
Assume a > 1 and let $L_a(b_0) = 0$. Then

\[
\lim_{N\to\infty} P\Bigl(\frac{1}{N}\log S_N^{-1} > \frac{1}{N}\log K\Bigr) = 1 \quad \text{if } Y \sim F^b,\ b < b_0
\]
\[
\lim_{N\to\infty} P\Bigl(\frac{1}{N}\log S_N^{-1} < \frac{1}{N}\log K\Bigr) = 1 \quad \text{if } Y \sim F^b,\ b > b_0
\]

and thus for a test of the composite hypotheses

    H_0: X ~ F, Y ~ F^b,   0 < b < b_0(a)
    against
    H_1: X ~ F, Y ~ F^b,   b_0(a) < b

the test is consistent.

5. The Signed Sequential Rank. We now extend the ranking procedure
defined in Section 3 to include the sign of the observation. This
corresponds to the signed rank statistic used in fixed sample size problems.

Definition 5.1 The signed sequential rank of $X_n$ relative to
$X_1, X_2, \ldots, X_{n-1}$ is the product of the sequential rank of $|X_n|$ relative
to $|X_1|, |X_2|, \ldots, |X_{n-1}|$ and sign($X_n$), where sign($X_n$) = 1 if
$X_n \ge 0$ and sign($X_n$) = -1 if $X_n < 0$.
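Definition 5.1 translates directly into a few lines of code. The following sketch computes the signed sequential ranks of a finite sequence; the function name is an assumption, and ties among the absolute values are ignored because the underlying distribution is taken to be continuous.

```python
def signed_sequential_ranks(xs):
    """Signed sequential rank of each x_n relative to x_1, ..., x_{n-1}.

    The rank of |x_n| among |x_1|, ..., |x_n| is multiplied by the sign of x_n
    (sign is +1 for x_n >= 0 and -1 for x_n < 0).
    """
    ranks = []
    for n, x in enumerate(xs):
        rank = 1 + sum(abs(prev) < abs(x) for prev in xs[:n])
        ranks.append(rank if x >= 0 else -rank)
    return ranks


if __name__ == "__main__":
    print(signed_sequential_ranks([0.5, -1.2, 0.3, -0.1]))  # [1, -2, 1, -1]
```

The nth signed rank always lies in {-n, ..., -1, 1, ..., n}, so the first one is always 1 or -1.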

In the case of sequential rank vectors there are $N!$ points in the
sample space corresponding to a sample $X_1, X_2, \ldots, X_N$ and in the
case of signed sequential rank vectors there are $2^N N!$ points
corresponding to the same sample. Of course if the basic variables (the $X_i$)
are positive random variables (or negative) the signed sequential ranks
are equivalent to the sequential ranks.

We found in Section 3 that when the basic random variables are
independent and identically distributed the sequential ranks are
independent. This result does not hold in general for signed sequential
ranks and so we now determine a sufficient condition for this result to
hold in this case.

Let $X_1, X_2, \ldots, X_N$ be independent and identically distributed
random variables and let $Z_i$ = sequential rank of $|X_i|$ relative to
$|X_1|, |X_2|, \ldots, |X_{i-1}|$, $E_i$ = sign($X_i$) with $Y_i = E_i Z_i$, i = 1,
2, ..., N. If $F(x) = P(X_i \le x)$ satisfies the condition in Lemma 2.2,
$E_i, |X_1|, |X_2|, \ldots, |X_i|$ are independent and it follows that $E_i$
and $Z_i$ are independent. Thus we get

(5.1)
\[
\begin{aligned}
P(Y_i = j)  &= P(E_i = 1, Z_i = j) = P(E_i = 1)\,P(Z_i = j) = (1 - F(0))\,\frac{1}{i} \\
P(Y_i = -j) &= P(E_i = -1, Z_i = j) = P(E_i = -1)\,P(Z_i = j) = F(0)\,\frac{1}{i}
\end{aligned}
\]

for j = 1, 2, ..., i, i = 1, 2, ..., N. $P(Z_i = j) = 1/i$ follows
from (3.4).

We will now show that the condition given in Lemma 2.2 is a
sufficient condition to guarantee the independence of the signed
sequential ranks.

Theorem 5.1. If $X_1, \ldots, X_N$ are independent and identically
distributed according to F(x) where $F(-x) = F(0)[1 - F(x) + F(-x)]$ for
all x > 0 then the signed sequential ranks $Y_1, \ldots, Y_N$ are
independent random variables.

Proof: Let $(i_1, i_2, \ldots, i_N)$ be an arbitrary outcome vector for
$(Y_1, Y_2, \ldots, Y_N)$ and let k be the number of positive integers in
$(i_1, i_2, \ldots, i_N)$. We have

\[
\prod_{m=1}^{N} P(Y_m = i_m) = \frac{[1 - F(0)]^k\,[F(0)]^{N-k}}{N!}
\]

from (5.1). Each outcome vector corresponds to a particular ordering of
the X's, with N-k of the X's negative. The absolute values of these
N-k X's have a particular ordering among the positive X's. So each
outcome vector is equivalent to an event like
$[0 < \epsilon_1 X_{j_1} < \epsilon_2 X_{j_2} < \cdots < \epsilon_N X_{j_N}]$ where k of the $\epsilon_i$ are 1 and
N-k are -1. The distribution function for -X is $1 - F(-x)$ and using
$F(-x) = F(0)[1 - F(x) + F(-x)]$ we have

\[
d\{1 - F(-x)\} = -dF(-x) = \frac{F(0)}{1 - F(0)}\, dF(x) .
\]

Hence

\[
\begin{aligned}
P(Y_1 = i_1, Y_2 = i_2, \ldots, Y_N = i_N)
 &= P(0 < \epsilon_1 X_{j_1} < \cdots < \epsilon_N X_{j_N}) \\
 &= \int\!\cdots\!\int_{0 < y_1 < \cdots < y_N < \infty} \prod_{i=1}^{N} dF_{\epsilon_i X_{j_i}}(y_i) \\
 &= \Bigl(\frac{F(0)}{1 - F(0)}\Bigr)^{N-k} \int\!\cdots\!\int_{0 < y_1 < \cdots < y_N < \infty} \prod_{i=1}^{N} dF(y_i) \\
 &= \Bigl(\frac{F(0)}{1 - F(0)}\Bigr)^{N-k} P(0 < X_{j_1} < \cdots < X_{j_N}) \\
 &= \Bigl(\frac{F(0)}{1 - F(0)}\Bigr)^{N-k} \frac{P(0 < X_i \text{ for all } i)}{N!}
  = \frac{[F(0)]^{N-k}\,[1 - F(0)]^k}{N!} .
\end{aligned}
\]

Thus $P(Y_1 = i_1, Y_2 = i_2, \ldots, Y_N = i_N) = \prod_{j=1}^{N} P(Y_j = i_j)$, establishing
the independence.

Remark: In the proof of the theorem we have assumed that $F(0) \ne 1$.
If F(0) = 1, the $X_i$ are negative random variables and the signed
sequential ranks reduce to {-(sequential rank of $|X_i|$)}, which are
independent.

The condition $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all x > 0 is
satisfied by distributions of positive, negative and symmetric (about 0)
random variables. A larger class of distributions satisfies the
condition. If we consider all measurable sets $A \subset [0, \infty)$ and define
$-A = \{x : -x \in A\}$, then the condition

\[ \Pr\{X \in A\} = k\,\Pr\{X \in -A\}, \qquad k > 0, \text{ all } A \]

is enough to insure that $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all x > 0,
since taking $A = [0, \infty)$ we get $k = (1 - F(0))/F(0)$ and taking A = [0, x] we get
$F(-x) = F(0)[1 - F(x) + F(-x)]$ for all x > 0. On the other hand,
starting with $F(-x) = F(0)[1 - F(x) + F(-x)]$ for all x > 0 we get

\[ dF(x) = \frac{1 - F(0)}{F(0)}\, d\{-F(-x)\} \]

and

\[
\Pr\{X \in A\} = \int_A dF(x) = k\int_A d\{-F(-x)\} = k\,\Pr\{X \in -A\} .
\]

We now consider the asymptotic distributions of sums of signed
sequential ranks based on observations from a distribution satisfying
the condition in Theorem 5.1. Let $X_1, X_2, \ldots, X_n$ be independent
identically distributed random variables with common distribution
function F(x) such that for all x > 0, $F(-x) = F(0)[1 - F(x) + F(-x)]$
holds. Define $Y_n$ = signed sequential rank of $X_n$. Using (5.1) we
get easily

(5.2)
\[
\begin{aligned}
E(Y_n) &= (1 - 2F(0))\,\frac{n+1}{2} \\
\mathrm{Var}(Y_n) &= \Bigl(\frac{1}{3} - \frac{(1 - 2F(0))^2}{4}\Bigr) n^2
 + \Bigl(\frac{1}{2} - \frac{(1 - 2F(0))^2}{2}\Bigr) n
 + \Bigl(\frac{1}{6} - \frac{(1 - 2F(0))^2}{4}\Bigr)
\end{aligned}
\]
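The mean and variance of $Y_n$ follow from the probabilities in (5.1), and the closed forms can be checked against a direct summation. A minimal sketch with exact fractions is given below; the function names are illustrative assumptions.

```python
from fractions import Fraction


def signed_rank_moments(n, f0):
    """Mean and variance of Y_n computed directly from the pmf (5.1)."""
    f0 = Fraction(f0)
    pmf = {j: (1 - f0) / n for j in range(1, n + 1)}
    pmf.update({-j: f0 / n for j in range(1, n + 1)})
    mean = sum(y * p for y, p in pmf.items())
    var = sum(y * y * p for y, p in pmf.items()) - mean**2
    return mean, var


def closed_form_moments(n, f0):
    """Closed forms for E(Y_n) and Var(Y_n), with c = 1 - 2 F(0)."""
    c = 1 - 2 * Fraction(f0)
    mean = c * Fraction(n + 1, 2)
    var = (
        (Fraction(1, 3) - c**2 / 4) * n**2
        + (Fraction(1, 2) - c**2 / 2) * n
        + (Fraction(1, 6) - c**2 / 4)
    )
    return mean, var


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, signed_rank_moments(n, "3/10"), closed_form_moments(n, "3/10"))
```

When F(0) = 1/2 the mean vanishes and the variance reduces to (n+1)(2n+1)/6.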

When F(x) satisfies the condition of Theorem 5.1 the signed
sequential ranks are independent, but not identically distributed, and
forming the partial sums $S_n = \sum_{i=1}^{n} Y_i$ we have

(5.3)
\[
\begin{aligned}
E(S_n) &= \frac{1 - 2F(0)}{4}\, n(n+3) \\
\mathrm{Var}(S_n) &= \frac{4 - 3(1 - 2F(0))^2}{72}\, n(n+1)(2n+1)
 + \frac{1 - (1 - 2F(0))^2}{4}\, n(n+1)
 + \frac{4 - 6(1 - 2F(0))^2}{24}\, n .
\end{aligned}
\]

Now for $\epsilon > 0$, k = 1, 2, ..., n, $\sigma_n^2 = \mathrm{Var}(S_n)$ it follows that for
large enough values of n

\[
\int_{|y - E(Y_k)| > \epsilon\sigma_n} (y - E(Y_k))^2\, dF_{Y_k}(y) = 0
\]

because the range of integration becomes a set with zero probability
since $Y_k$ is bounded according to $|Y_k| \le k$ and $\sigma_n \approx c \cdot n^{3/2}$. Thus
as $n \to \infty$ the integral is zero for all k = 1, 2, ..., n and by the
Lindeberg-Feller Theorem it follows that $S_n$ is asymptotically normal.

If we normalize the signed sequential ranks and then consider
partial sums we get

\[
S_n^* = \sum_{i=1}^{n} \frac{Y_i - E(Y_i)}{[\mathrm{Var}(Y_i)]^{1/2}}
\]

and

\[
0 \le \Bigl|\frac{Y_i - E(Y_i)}{[\mathrm{Var}(Y_i)]^{1/2}}\Bigr|
 \le \frac{2i}{(\alpha_1 i^2 + \alpha_2 i + \alpha_3)^{1/2}}
 = \frac{2}{(\alpha_1 + \alpha_2/i + \alpha_3/i^2)^{1/2}} \to \frac{2}{\alpha_1^{1/2}}
\]

as $i \to \infty$, where $\alpha_1 = \frac{1}{3} - \frac{(1 - 2F(0))^2}{4}$ and $\alpha_2$, $\alpha_3$ are the remaining
coefficients of $\mathrm{Var}(Y_i)$. Hence the normalized signed
sequential ranks are uniformly bounded and by the bounded Lyapounov
Theorem $S_n^*/\sqrt{n}$ is asymptotically distributed as a unit normal random
variable.

As was noted in the introduction, some statistical problems are
concerned with detecting a change in the distribution of a sequence of
observations obtained from some process. We now consider the case where
in the basic set of independent random variables $\{X_1, X_2, \ldots, X_N\}$
the first m are distributed according to F(x) and the remaining
N-m are distributed according to G(x). As before let $Y_i$ denote
the signed sequential rank of $X_i$. Since each possible outcome vector
for $(Y_1, Y_2, \ldots, Y_N)$ corresponds to an event of the form

\[ [0 < \epsilon_1 X_{i_1} < \epsilon_2 X_{i_2} < \cdots < \epsilon_N X_{i_N}] \]

where $\epsilon_i = \pm 1$ and $(i_1, i_2, \ldots, i_N)$ is a permutation of
(1, 2, ..., N), the joint distribution of the signed sequential ranks
is obtainable, in principle, from

(5.4)
\[ P(0 < \epsilon_1 X_{i_1} < \epsilon_2 X_{i_2} < \cdots < \epsilon_N X_{i_N}) = P(F, G, \epsilon) \]

where $\epsilon = (\epsilon_1, \epsilon_2, \ldots, \epsilon_N)$. In general (when $F \ne G$) the $Y_i$ are
not independent. For example if we are sampling from an unknown
distribution F(x) and we wish to detect a change in distribution to $F^a(x)$,
a > 1 (a stochastically larger distribution) where F(0) = 0, we lose
the property of independence. In this simple case signed sequential
ranks and sequential ranks are equivalent and taking N = 3 with m = 1
we have

\[
P(Y_1 = 1) = 1, \qquad P(Y_2 = 1) = \frac{1}{1+a}, \qquad
P(Y_3 = 1) = \frac{1}{2}\,\frac{1+3a}{(1+a)(1+2a)}
\]

and $P(Y_1 = 1, Y_2 = 1, Y_3 = 1) = \frac{1}{2(1+2a)}$. In general, for a > 1

\[
\frac{1}{2(1+2a)} \ne 1 \cdot \frac{1}{1+a} \cdot \frac{1}{2}\,\frac{1+3a}{(1+a)(1+2a)}
 = \frac{1+3a}{2(1+2a)(1+a)^2} .
\]

Since there are cases when the signed sequential ranks are
independent we now determine the marginal distributions for signed
sequential ranks in the case where a change in distribution from F(x) to
G(x) occurs for arbitrary continuous distributions F(x) and G(x).
Let $X_1, X_2, \ldots, X_N$ be independent random variables with $X_i$,
$1 \le i \le m$, distributed as F(x) and $X_i$, $m+1 \le i \le N$, distributed
as G(x). Take N = m + n, and let $Y_i$ be the signed sequential
rank of $X_i$ and $H_k(t)$ the distribution function of the kth
order statistic from the set $\{|X_1|, |X_2|, \ldots, |X_{N-1}|\}$. It is enough
to determine the distribution of $Y_N$. Using Lemma 2.1 and
$P(|X_i| \le x) = F(x) - F(-x)$ for x > 0 we get

(5.5)
\[
\begin{aligned}
H_k(t) = \sum_{i=k}^{N-1} \sum_{j=0}^{i} \binom{m}{j}\binom{n-1}{i-j}
 & (F(t) - F(-t))^{j}\,(1 - F(t) + F(-t))^{m-j} \\
 & \cdot (G(t) - G(-t))^{i-j}\,(1 - G(t) + G(-t))^{n-1-i+j}, \qquad t > 0 .
\end{aligned}
\]

Now let $Z_k$ be the kth order statistic from $\{|X_1|, |X_2|, \ldots, |X_{N-1}|\}$. Then

\[
P(Y_N = 1) = P(0 < X_N < Z_1) = E(G(Z_1)) - G(0)
 = \int_0^\infty G(t)\, dH_1(t) - G(0) = 1 - G(0) - \int_0^\infty H_1(t)\, dG(t) .
\]

Also for $2 \le k \le N-1$

\[
P(Y_N = k) = P(Z_{k-1} < X_N < Z_k) = E(G(Z_k)) - E(G(Z_{k-1})) .
\]

Now

\[
E(G(Z_k)) = \int_0^\infty G(t)\, dH_k(t) = 1 - \int_0^\infty H_k(t)\, dG(t)
\]

and

\[
\begin{aligned}
P(Y_N = k) &= \int_0^\infty \{H_{k-1}(t) - H_k(t)\}\, dG(t) \\
 &= \sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j}\,(1 - F(t) + F(-t))^{m-j} \\
 &\qquad\qquad \cdot (G(t) - G(-t))^{k-1-j}\,(1 - G(t) + G(-t))^{n-k+j}\, dG(t) .
\end{aligned}
\]

For k = N we get $P(Y_N = N) = P(Z_{N-1} < X_N) = 1 - E(G(Z_{N-1}))
= \int_0^\infty H_{N-1}(t)\, dG(t)$. For negative values of $Y_N$ we can calculate
$P(Y_N = -k) = P(Z_{k-1} < -X_N < Z_k)$ in a similar manner to obtain finally,
for $2 \le k \le N-1$,

(5.6)
\[
\begin{aligned}
P(Y_N = 1) &= 1 - G(0) - \int_0^\infty H_1(t)\, dG(t) \\
P(Y_N = -1) &= G(0) + \int_0^\infty H_1(t)\, dG(-t) \\
P(Y_N = k) &= \sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j}\,(1 - F(t) + F(-t))^{m-j} \\
 &\qquad\qquad \cdot (G(t) - G(-t))^{k-1-j}\,(1 - G(t) + G(-t))^{n-k+j}\, dG(t) \\
P(Y_N = -k) &= -\sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j} \int_0^\infty (F(t) - F(-t))^{j}\,(1 - F(t) + F(-t))^{m-j} \\
 &\qquad\qquad \cdot (G(t) - G(-t))^{k-1-j}\,(1 - G(t) + G(-t))^{n-k+j}\, dG(-t) \\
P(Y_N = N) &= \int_0^\infty H_{N-1}(t)\, dG(t) \\
P(Y_N = -N) &= -\int_0^\infty H_{N-1}(t)\, dG(-t)
\end{aligned}
\]

The equations given in (5.6) can be written in one formula as

(5.7)
\[
\begin{aligned}
P(Y_N = \epsilon k) = \epsilon \sum_{j=0}^{k-1} \binom{m}{j}\binom{n-1}{k-1-j}
 \int_0^\infty & (F(t) - F(-t))^{j}\,(1 - F(t) + F(-t))^{m-j} \\
 & \cdot (G(t) - G(-t))^{k-1-j}\,(1 - G(t) + G(-t))^{n-k+j}\, dG(\epsilon t)
\end{aligned}
\]

where $\epsilon = \pm 1$ and k = 1, 2, ..., N. Verification that (5.7) reduces
to (5.6) in the case k = 1 and $\epsilon = \pm 1$ can be accomplished through
the following result.

Lemma 5.1
\[
\sum_{i=0}^{N} \sum_{j=0}^{i} \binom{m}{j}\binom{n}{i-j}\, p^{j}(1-p)^{m-j}\, q^{i-j}(1-q)^{n-i+j} = 1 \qquad (N = m+n).
\]

Proof: Let $a_{ij} = \binom{m}{j}\binom{n}{i-j}\, p^{j}(1-p)^{m-j}\, q^{i-j}(1-q)^{n-i+j}$ and recall
the convention $\binom{a}{b} = 0$ if b > a. Instead of summing as indicated
we sum along diagonals and get

\[
\begin{aligned}
\sum_{i=0}^{N} \sum_{j=0}^{i} a_{ij}
 &= \sum_{\ell=0}^{N} \sum_{j=0}^{N-\ell} a_{\ell+j,\,j} \\
 &= \sum_{\ell=0}^{N} \sum_{j=0}^{N-\ell} \binom{m}{j}\binom{n}{\ell}\, p^{j}(1-p)^{m-j}\, q^{\ell}(1-q)^{n-\ell} \\
 &= \sum_{\ell=0}^{N} \binom{n}{\ell}\, q^{\ell}(1-q)^{n-\ell} \sum_{j=0}^{N-\ell} \binom{m}{j}\, p^{j}(1-p)^{m-j} \\
 &= \sum_{\ell=0}^{n} \binom{n}{\ell}\, q^{\ell}(1-q)^{n-\ell} \sum_{j=0}^{N-\ell} \binom{m}{j}\, p^{j}(1-p)^{m-j}
 \qquad \text{since } \binom{n}{\ell} = 0,\ \ell > n .
\end{aligned}
\]

Since $0 \le \ell \le n$ the upper limit in the second sum satisfies
$N \ge N - \ell \ge N - n = m$, implying $m \le N - \ell$ and making the second sum
always equal to 1. Using the binomial theorem a second time gives the result.

Letting p = F(t) - F(-t), q = G(t) - G(-t) we can write
$H_1(t) = 1 - [1-p]^m [1-q]^{n-1}$ to complete the verification.
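Lemma 5.1 can be verified numerically for particular m, n, p and q. The sketch below sums the terms $a_{ij}$ exactly as indicated in the statement of the lemma, skipping the terms whose binomial coefficient vanishes by convention; the function name is an assumption.

```python
from fractions import Fraction
from math import comb


def lemma_sum(m, n, p, q):
    """Double sum of Lemma 5.1 with N = m + n, in exact arithmetic."""
    p, q = Fraction(p), Fraction(q)
    total = Fraction(0)
    for i in range(m + n + 1):
        for j in range(i + 1):
            if j > m or i - j > n:
                continue  # binomial coefficient is zero by convention
            total += (
                comb(m, j) * comb(n, i - j)
                * p**j * (1 - p) ** (m - j)
                * q ** (i - j) * (1 - q) ** (n - i + j)
            )
    return total


if __name__ == "__main__":
    print(lemma_sum(3, 4, "1/3", "2/5"))  # 1
```

Dropping the single term i = 0 from the sum leaves $1 - (1-p)^m (1-q)^n$, which is the form used for $H_1(t)$ with n replaced by n-1.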

Using Lemma 5.1 and (5.7) we can compute the characteristic
function for $Y_N$ as

(5.8)
\[
\begin{aligned}
\varphi(u) = E(e^{iuY_N})
 &= \int_0^\infty e^{iu}\,[1 - q(t) + q(t)e^{iu}]^{n-1}\,[1 - p(t) + p(t)e^{iu}]^{m}\, dG(t) \\
 &\quad - \int_0^\infty e^{-iu}\,[1 - q(t) + q(t)e^{-iu}]^{n-1}\,[1 - p(t) + p(t)e^{-iu}]^{m}\, dG(-t)
\end{aligned}
\]

where p(t) = F(t) - F(-t) and q(t) = G(t) - G(-t).
Differentiating (5.8) and setting u = 0 we get

(5.9)
\[
E(Y_N) = \int_0^\infty \bigl(1 + (n-1)\,q(t) + m\,p(t)\bigr)\, d(G(t) + G(-t))
\]

(5.10)
\[
E(Y_N^2) = 1 + (n-1)\,\frac{2n+5}{6}
 + \int_0^\infty \bigl(3m\,p(t) + 2m(n-1)\,q(t)\,p(t) + m(m-1)\,p^2(t)\bigr)\, dq(t) .
\]

The marginal distribution of $Y_N$, equation (5.7), holds for
arbitrary continuous distribution functions F and G and thus (5.8),
(5.9) and (5.10) are the general expressions for the characteristic
function, mean and second moment of the $Y_N$. Thus to generalize (5.2)
to arbitrary continuous distributions F we let F = G in (5.9) and
(5.10) and we get

(5.11)
\[
E(Y_N) = (N-1)\Bigl\{\frac{1}{2} - G^2(0) - 2\int_0^\infty G(-t)\, dG(t)\Bigr\} + 1 - 2G(0)
\]

(5.12)
\[
E(Y_N^2) = \frac{N^2}{3} + \frac{N}{2} + \frac{1}{6} .
\]

6. An Application of Signed Sequential Ranking to Process Control.

As stated in the introduction, in the process control problem we wish
to determine a procedure which will determine when a given sequence of
random variables changes from being distributed according to F(x) to
a different distribution G(x). In particular we will consider the case
where F(x) satisfies the condition of Theorem 5.1 and changes to G(x)
which also satisfies the condition. Inasmuch as the distribution of the
signed sequential ranks depends on the parameter F(0) we will of
course require $G(0) \ne F(0)$. The procedure described in this section
is still applicable to cases where G does not satisfy the condition
in Theorem 5.1 but we do not have exact results in such instances.
However, empirical results are presented at the end of this section
bearing on the effectiveness of the procedure for special cases.

Let $X_1, X_2, \ldots$ be a sequence of independent random variables
(observations on a process) with common distribution function F(x)
where for all x > 0 the condition $F(-x) = F(0)[1 - F(x) + F(-x)]$ holds,
and let $Y_1, Y_2, \ldots$ be the corresponding signed sequential ranks. We
define the cumulative sums $S_n = Z_1 + Z_2 + \cdots + Z_n$ where $Z_i = Y_i/i$.
Since the condition in Theorem 5.1 is satisfied the $Z_i$ are independent and

(6.1)
\[
P(Z_n = t) =
\begin{cases}
\dfrac{1 - F(0)}{n} & t = \dfrac{1}{n}, \dfrac{2}{n}, \ldots, 1 \\[2ex]
\dfrac{F(0)}{n} & t = -\dfrac{1}{n}, -\dfrac{2}{n}, \ldots, -1
\end{cases}
\]

Some easy computations yield

(6.2)
\[
\begin{aligned}
E(Z_n) &= \frac{1 - 2F(0)}{2}\Bigl(1 + \frac{1}{n}\Bigr) \\
\mathrm{Var}(Z_n) &= E(Z_n^2) - \Bigl(\frac{1 - 2F(0)}{2}\Bigr)^2\Bigl(1 + \frac{1}{n}\Bigr)^2, \qquad
E(Z_n^2) = \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2} \\
E(S_n) &= \frac{1 - 2F(0)}{2}\Bigl(n + \sum_{k=1}^{n} \frac{1}{k}\Bigr) \\
\mathrm{Var}(S_n) &= \sum_{k=1}^{n} \mathrm{Var}(Z_k) .
\end{aligned}
\]



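Although tedious, the distribution $P(S_n = t)$ can be computed
exactly. For example

\[
P(S_1 = t) = P(Z_1 = t) =
\begin{cases}
1 - F(0) & t = 1 \\
F(0) & t = -1
\end{cases}
\]

\[
P(S_2 = t) = P(S_1 + Z_2 = t) =
\begin{cases}
\dfrac{(1 - F(0))^2}{2} & t = \tfrac{3}{2},\ 2 \\[1.5ex]
\dfrac{(1 - F(0))\,F(0)}{2} & t = \tfrac{1}{2} \\[1.5ex]
(1 - F(0))\,F(0) & t = 0 \\[1.5ex]
\dfrac{(1 - F(0))\,F(0)}{2} & t = -\tfrac{1}{2} \\[1.5ex]
\dfrac{(F(0))^2}{2} & t = -\tfrac{3}{2},\ -2
\end{cases}
\]

and in general

(6.3)
\[
P(S_n = t) = P(S_{n-1} = t - Z_n) = \sum_{x} P(S_{n-1} = t - x)\, P(Z_n = x)
\]

where x ranges over $-1, -\frac{n-1}{n}, \ldots, -\frac{1}{n}, \frac{1}{n}, \ldots, \frac{n-1}{n}, 1$.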
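The recursion (6.3) is a discrete convolution and is straightforward to carry out exactly. The sketch below builds the distribution of $S_n$ step by step from the distribution (6.1) of $Z_n$; the function names and the dictionary representation of a probability function are assumptions made for the example.

```python
from collections import defaultdict
from fractions import Fraction


def z_pmf(n, f0):
    """Probability function (6.1) of Z_n = Y_n / n."""
    f0 = Fraction(f0)
    pmf = {Fraction(k, n): (1 - f0) / n for k in range(1, n + 1)}
    pmf.update({Fraction(-k, n): f0 / n for k in range(1, n + 1)})
    return pmf


def s_pmf(n, f0):
    """Exact distribution of S_n = Z_1 + ... + Z_n via the recursion (6.3)."""
    dist = {Fraction(0): Fraction(1)}
    for k in range(1, n + 1):
        step = z_pmf(k, f0)
        nxt = defaultdict(Fraction)
        for s, ps in dist.items():
            for x, px in step.items():
                nxt[s + x] += ps * px
        dist = dict(nxt)
    return dist


if __name__ == "__main__":
    for t, p in sorted(s_pmf(2, "3/10").items()):
        print(t, p)
```

For n = 2 the output reproduces the seven-point distribution displayed above; for example the mass at t = 0 is (1 - F(0)) F(0).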

The procedure we will propose will stop the process whenever $S_n$
does not lie in some fixed open interval (b, a) where
$-\infty < b < 0 < a < \infty$. In order to determine the operating characteristics
of such a procedure, such as the average number of observations until the
process is stopped, we must compute

(6.4)
\[
P(N = n) = P(b < S_i < a,\ i = 1, 2, \ldots, n-1,\ S_n \notin (b, a))
\]

N being the smallest integral value for which $S_N$ does not lie in the
open interval (b, a). Then $E(N) = \sum_{n=1}^{\infty} n\,P(N = n)$ gives the average
number of observations as a function of a, b and F(0). In order to
compute the probability of reaching the boundaries b and a for the
first time at time n the following procedure may be used. We define
$F_1(x) = P(S_1 \le x)$, $F_2(x) = P(S_2 \le x,\ b < S_1 < a)$ and in general

(6.5)
\[
F_n(x) = P(S_n \le x,\ b < S_i < a,\ i = 1, 2, \ldots, n-1) .
\]

It follows that $F_2(x) = P(Z_2 \le x - S_1,\ b < S_1 < a) = \int_b^a F_{Z_2}(x - y)\, dF_1(y)$
and in general

(6.6)
\[
F_n(x) = \int_b^a F_{Z_n}(x - y)\, dF_{n-1}(y) .
\]

The probability of reaching boundary a for the first time at n is
$F_n(\infty) - F_n(a)$ and the probability of reaching boundary b for the
first time at n is $F_n(b) - F_n(-\infty)$. Using these probabilities we
can also calculate E(N).
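Because the $Z_n$ are discrete, the recursion (6.6) can be carried out as a finite sum over the attainable values of $S_{n-1}$ inside (b, a). The sketch below computes P(N = n) for n up to a chosen horizon; the function names, the horizon `n_max` and the returned pair are assumptions, and truncating at `n_max` gives only a lower bound for E(N).

```python
from collections import defaultdict
from fractions import Fraction


def z_pmf(n, f0):
    """Probability function of Z_n = Y_n / n under the condition of Theorem 5.1."""
    f0 = Fraction(f0)
    pmf = {Fraction(k, n): (1 - f0) / n for k in range(1, n + 1)}
    pmf.update({Fraction(-k, n): f0 / n for k in range(1, n + 1)})
    return pmf


def stopping_distribution(a, b, f0, n_max):
    """P(N = n) for n = 1..n_max, plus the mass still inside (b, a) at n_max."""
    a, b = Fraction(a), Fraction(b)
    alive = {Fraction(0): Fraction(1)}     # distribution of S_n on paths not yet stopped
    stop = {}
    for n in range(1, n_max + 1):
        nxt = defaultdict(Fraction)
        stopped = Fraction(0)
        for s, ps in alive.items():
            for x, px in z_pmf(n, f0).items():
                t = s + x
                if b < t < a:
                    nxt[t] += ps * px
                else:
                    stopped += ps * px     # first exit from (b, a) at time n
        stop[n] = stopped
        alive = dict(nxt)
    return stop, sum(alive.values(), Fraction(0))


if __name__ == "__main__":
    stop, remaining = stopping_distribution("3/2", "-3/2", "1/2", n_max=8)
    print({n: float(p) for n, p in stop.items()}, float(remaining))
    print("truncated E(N):", float(sum(n * p for n, p in stop.items())))
```

With a = 3/2, b = -3/2 and F(0) = 1/2 the process cannot stop at n = 1, and it stops at n = 2 with probability 1/2.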

Computations of the probability functions in (6.6) could be 
carried out and the computational burden lessened somewhat by noting 
that for large values of n, the tend to become identically 

distributed. We now consider some approximations to E(N) using some 
results from sequential analysis. 
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Using (5*1) the characteristic function of Z^ is given by 

( 6 . 7 ) T (u) 1 -F( 0 ) e 1 u/n - e lu(ltl / n) , F( 0 ) e - 1 - e- lu ( m / n > 

n n i u/n n I -i u/n 

1 -e 1 -e ' 

and using limited expansions of exponentials we have 



(6.8) 



cp(u) 



lim cp n (u) = (1-F(0)) 
n -> oo 



1U n i 

e - 1 + F( 0 ) 1 ‘ e 



-1U 



1U 



1U 



which is the characteristic function of a random variable with density 



(6.9) 



F(0) -1 < x < 0 



f(x) = i 



1-F(0) 0 < x < 1 

i 

For large values of n, Z^ has approximately the density of (6.9). 
The moment generating function associated with (6.9) is 



( 6 . 10 ) 



M(t) = F(0) - f -i- (1-F(0)) 




1 



which exists for all real values of t. As an approximation we will use $E(Z_n) = \frac{1-2F(0)}{2}$. In the cumulative sums $S_n = Z_1 + Z_2 + \cdots + Z_n$ the $Z_i$ are independent and, as noted, not identically distributed. However if we disregard the first few signed sequential ranks and start later in the sequence the approximation to identically distributed random variables improves. As before, we take N to be the smallest integral value for which $S_N$ does not lie in (b, a). We use the results of Wald [5] in the sequel.

Consider first the case where F(0) = 1/2 (F is symmetric about 0). Here $E(Z_n) = 0$ and using (3.8) of [5]

$$E(N) = \frac{E(S_N^2)}{E(Z_n^2)} \approx \frac{-ab}{E(Z_n^2)},$$

where $E(Z_n^2) = 1/3$ under the density (6.9). When $F(0) \ne 1/2$, $E(Z_n) \ne 0$ and we can use $E(S_N) = E(Z_n)\,E(N)$ and the approximation $E(S_N) = a\,P(S_N \ge a) + b\,(1 - P(S_N \ge a))$ to get

$$E(N) = \begin{cases} -3ab, & F(0) = 1/2 \\[4pt] \dfrac{2b + 2(a-b)\,P(S_N \ge a)}{1-2F(0)}, & F(0) \ne 1/2. \end{cases} \tag{6.11}$$



Let h be the nonzero root of M(t) = 1. A further approximation gives

$$P(S_N \ge a) = \frac{1 - e^{bh}}{e^{ah} - e^{bh}} \tag{6.12}$$

where of course h depends on the value of F(0). Setting M(t) = 1 we get

$$F(0) = \frac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$$

which must be solved for t. Each solution corresponding to a fixed value of F(0) is a value for h in (6.12) yielding, in turn, a solution to (6.11).
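As a numerical illustration of (6.11) and (6.12), the sketch below solves $F(0) = (1 + t - e^{t})/(2 - e^{t} - e^{-t})$ for h by bisection and then evaluates the approximation to E(N). It is a hedged example rather than the author's program: the function names, the bracket [-60, 60] and the iteration count are assumptions, it presumes b < 0 < a, and it relies on the right-hand side being increasing in t, which the text establishes next.

```python
import math


def g(t):
    """Right-hand side of F(0) = (1 + t - e^t) / (2 - e^t - e^-t)."""
    if abs(t) < 1e-4:
        # Series expansion near t = 0 avoids a 0/0 evaluation: g = 1/2 + t/6 + O(t^3).
        return 0.5 + t / 6.0
    num = t - math.expm1(t)                      # 1 + t - e^t
    den = -(math.expm1(t) + math.expm1(-t))      # 2 - e^t - e^-t
    return num / den


def solve_h(f0, lo=-60.0, hi=60.0, iters=200):
    """Bisection for the root h of g(h) = f0 (g is increasing in t)."""
    if not 0.0 < f0 < 1.0:
        raise ValueError("f0 must lie strictly between 0 and 1")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < f0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def prob_upper(a, b, h):
    """Approximation (6.12) to P(S_N >= a), written to avoid overflow."""
    if abs(h) < 1e-9:
        return -b / (a - b)                      # limit of (6.12) as h -> 0
    if h > 0:
        # (6.12) with numerator and denominator divided by e^{ah}
        return (math.exp(-a * h) - math.exp((b - a) * h)) / (1.0 - math.exp((b - a) * h))
    # (6.12) with numerator and denominator divided by e^{bh}
    return (math.exp(-b * h) - 1.0) / (math.exp((a - b) * h) - 1.0)


def expected_n(a, b, f0):
    """Approximation (6.11) to E(N) for the continuation interval (b, a)."""
    if abs(f0 - 0.5) < 1e-9:
        return -3.0 * a * b
    h = solve_h(f0)
    return (2.0 * b + 2.0 * (a - b) * prob_upper(a, b, h)) / (1.0 - 2.0 * f0)
```

For instance, `expected_n(10, -10, 0.5)` returns the symmetric-case value $3a^2 = 300$, and values of F(0) away from 1/2 give a shorter expected run.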

Let $g(t) = \dfrac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$. Then

$$g'(t) = \frac{4(1 - \cosh t) + 2t \sinh t}{4(1 - \cosh t)^2}$$

and considering the numerator $a(t) = 4(1 - \cosh t) + 2t \sinh t$ we find $a'(t) = -2\sinh t + 2t \cosh t$ and moreover

$$a'(t) < 0 \ \ (t < 0), \qquad a'(0) = 0, \qquad a'(t) > 0 \ \ (t > 0).$$

Since $a(0) = 0$, it follows that $a(t) > 0$ for $t \ne 0$, making $g'(t) > 0$, and g(t) is monotone increasing in t. As F(0) increases from 0 to 1 the solution to $F(0) = \dfrac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$, say h(F(0)), increases from $-\infty$ to $\infty$. Notice that

$$\lim_{t \to \infty} g(t) = \lim_{t \to \infty} \frac{e^{-t} + t e^{-t} - 1}{2e^{-t} - 1 - e^{-2t}} = 1$$

$$\lim_{t \to -\infty} g(t) = \lim_{t \to -\infty} \frac{e^{t} + t e^{t} - e^{2t}}{2e^{t} - e^{2t} - 1} = 0.$$



Now for h = h(F(0)) increasing, $P(S_N \ge a)$ is decreasing in h since for

$$g(h) = \frac{1 - e^{bh}}{e^{ah} - e^{bh}}$$

(here g denotes this function of h, not the function g(t) above) we have

$$g'(h) = \frac{(a-b)\,e^{(a+b)h} - \left[a e^{ah} - b e^{bh}\right]}{\left(e^{ah} - e^{bh}\right)^2}$$

and considering the numerator, after factoring out $e^{(a+b)h}$ we have to show

$$a - b - a e^{-bh} + b e^{-ah} < 0 \quad \text{for all } h \ne 0.$$

Writing $\alpha = a/(a-b)$, $\beta = -b/(a-b)$ we have $\alpha + \beta = 1$, $\alpha, \beta > 0$ and we must show that $1 < \alpha e^{\beta(a-b)h} + \beta e^{-\alpha(a-b)h}$. Let $f(h) = \alpha e^{\beta(a-b)h} + \beta e^{-\alpha(a-b)h}$ and notice that f(0) = 1, f'(0) = 0 with f''(h) > 0 since

$$f'(h) = \alpha\beta(a-b)\, e^{\beta(a-b)h} - \alpha\beta(a-b)\, e^{-\alpha(a-b)h}$$

$$f''(h) = \alpha\beta^2(a-b)^2 e^{\beta(a-b)h} + \alpha^2\beta(a-b)^2 e^{-\alpha(a-b)h}.$$

Thus f(h) attains its minimum value at h = 0. For increasing values of F(0) the corresponding values of h = h(F(0)) increase and $P(S_N \ge a)$ decreases. For $F(0) \ne 1/2$ we have

$$E(N) = \frac{2b + 2(a-b)\,\dfrac{1 - e^{bh}}{e^{ah} - e^{bh}}}{1-2F(0)} \tag{6.13}$$




In particular taking b = -a, $h \ne 0$ we have

$$E(N) = \frac{2a\left(1 - e^{-ah} - \sinh(ah)\right)\left(1 - \cosh h\right)}{\sinh(ah)\left(\sinh h - h\right)} \tag{6.14}$$

For h = 0, $E(N) = 3a^2$ and (6.14) is plotted in Figure 1 for selected values of a. $g(t) = \dfrac{1 + t - e^{t}}{2 - e^{t} - e^{-t}}$ is shown in Figure 2. E(N) is plotted against F(0) in Figure 3.



Suppose now that a process is observed according to some measurable characteristic and we have a sequence $X_1, X_2, \ldots$, distributed according to F(x) where F(x) satisfies the condition of Theorem 5.1 and moreover we assume F(0) = 1/2. If we set boundaries (-a, a), a > 0, and use the rule which requires us to stop the process when $S_n \notin (-a, a)$ for the first time we can expect to continue for $3a^2$ observations before stopping. However if the process is such that $F(0) \ne 1/2$ we will stop the process in the reduced average time as given in Figure 1. Similar computations can be made for arbitrary intervals (b, a) using (6.13). However, in the process control problem we wish to detect when a change takes place in the distribution of the basic random variables. We have seen that when a change takes place from F(x) to G(x) at some point in the sequence, the signed sequential ranks are no longer independent in general. Suppose the change is to a distribution G(x) such that the condition of Theorem 5.1 is still satisfied and the change takes place at time m. Intuitively, one might feel that for large values of n the distribution of the $(m+n)$th signed sequential rank would depend very little on F and m. This being so we could assume the sequence $\{Z_n\}$ to be independent for the purpose of determining the expected number of observations until the process is stopped. For example suppose we take (b, a) as the continuation interval and denote (6.13) by E(a, b, F(0)). Given that $S_m = x$, b < x < a, the expected number of additional observations under G(x) is

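$$E(N \mid S_m = x) = E(a-x,\ b-x,\ G(0)) = \frac{2(b-x)}{1-2G(0)} + \frac{2(a-b)}{1-2G(0)} \cdot \frac{e^{xh} - e^{bh}}{e^{ah} - e^{bh}} \tag{6.15}$$

The conditional distribution for $S_m$ is

$$P(S_m \le x \mid b < S_m < a) = \frac{P(b < S_m \le x)}{P(b < S_m < a)}, \qquad b < x < a,$$

that is,

$$P(S_m \le x \mid b < S_m < a) = \begin{cases} 1, & x \ge a \\[4pt] \dfrac{F_{S_m}(x) - F_{S_m}(b)}{F_{S_m}(a) - F_{S_m}(b)}, & b < x < a \\[4pt] 0, & x \le b \end{cases} \tag{6.16}$$

and the unconditional total expected number of observations is given by

$$E(N, m, G(0)) = m + \frac{1}{F_{S_m}(a) - F_{S_m}(b)} \int_b^a E(N \mid S_m = x)\, dF_{S_m}(x) \tag{6.17}$$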



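To lend some support to the statement that for large values of n, the distribution of $Z_{m+n}$ does not depend too much on m and the distribution of $X_1, X_2, \ldots, X_m$ (and thus could be taken as G(x) to justify (6.15)) we examine its characteristic function as $n \to \infty$. We have

$$\begin{aligned}
\lim_{n \to \infty} \varphi_{m+n}(u) &= \lim_{n \to \infty} E\left(e^{iuZ_{m+n}}\right) \\
&= \lim_{n \to \infty} \int_0^\infty e^{iu/(m+n)} \left(1 - q(t) + q(t)\, e^{iu/(m+n)}\right)^{n-1} \left(1 - p(t) + p(t)\, e^{iu/(m+n)}\right)^{m} dG(t) \\
&\quad - \lim_{n \to \infty} \int_0^\infty e^{-iu/(m+n)} \left(1 - q(t) + q(t)\, e^{-iu/(m+n)}\right)^{n-1} \left(1 - p(t) + p(t)\, e^{-iu/(m+n)}\right)^{m} dG(-t) \\
&= \int_0^\infty \lim_{n \to \infty} \left(1 - q(t) + q(t)\, e^{iu/(m+n)}\right)^{n-1} dG(t) - \int_0^\infty \lim_{n \to \infty} \left(1 - q(t) + q(t)\, e^{-iu/(m+n)}\right)^{n-1} dG(-t) \\
&= \int_0^\infty e^{iq(t)u}\, dG(t) - \int_0^\infty e^{-iq(t)u}\, dG(-t).
\end{aligned}$$

Also, since

$$q(t) = \frac{G(t)}{1-G(0)} - \frac{G(0)}{1-G(0)} \qquad \text{and} \qquad q(t) = 1 - \frac{G(-t)}{G(0)}$$

we have

$$\lim_{n \to \infty} \varphi_{m+n}(u) = G(0)\,\frac{1 - e^{-iu}}{iu} + (1-G(0))\,\frac{e^{iu} - 1}{iu}$$

corresponding to (6.10).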

We now consider a case where we have a change from a distribution satisfying the condition in Theorem 5.1 to another such distribution. Imagine a production process where some dimension is measured on the items being produced. Let these measurements be $X_1, X_2, \ldots$ assumed to be independent and identically distributed as F(x). Each item is subject to inspection and if $X_i \le 0$ the item is removed from the production line with probability p. The result is a new sequence, say $C_1, C_2, \ldots$, and we call this random censoring and $\{C_i\}$ the censored sequence. The distribution of $C_i$ can be found by

$$\begin{aligned}
P(C_i \le t) &= \sum_{j=1}^{\infty} P(C_i \le t \mid C_i = X_{i+j-1})\, P(X_{i+j-1} = C_i) \\
&= \sum_{j=1}^{\infty} P(X_{i+j-1} \le t \mid C_i = X_{i+j-1})\, P(X_{i+j-1} = C_i) \\
&= P(X_1 \le t \mid X_1 = C_1) \sum_{j=1}^{\infty} P(C_i = X_{i+j-1}) \\
&= P(X_1 \le t \mid X_1 = C_1).
\end{aligned}$$

For $t \le 0$

$$\begin{aligned}
P(C_i \le t) &= P(X_1 \le t \mid X_1 = C_1) \\
&= \frac{P(X_1 \le t,\ X_1 \le 0,\ X_1 \text{ not censored})}{P(C_1 = X_1)} \\
&= \frac{P(X_1 \le t,\ X_1 \text{ not censored})}{1 - pF(0)} \\
&= \frac{(1-p)\,F(t)}{1 - pF(0)}.
\end{aligned}$$

For t > 0

$$\begin{aligned}
P(C_i \le t) &= \frac{P(X_1 \le t,\ X_1 \le 0,\ X_1 \text{ not censored}) + P(X_1 \le t,\ 0 < X_1)}{1 - pF(0)} \\
&= \frac{F(t) - pF(0)}{1 - pF(0)}.
\end{aligned}$$

For random censoring when $X_i \le 0$ we have

$$P(C_i \le t) = \begin{cases} \dfrac{(1-p)\,F(t)}{1 - pF(0)}, & t \le 0 \\[6pt] \dfrac{F(t) - pF(0)}{1 - pF(0)}, & t > 0. \end{cases} \tag{6.9}$$

In a similar way if we censor with probability p when $X_i > 0$ we have

$$P(C_i \le t) = \begin{cases} \dfrac{F(t)}{1 - p + pF(0)}, & t \le 0 \\[6pt] \dfrac{(1-p)\,F(t) + pF(0)}{1 - p + pF(0)}, & t > 0. \end{cases} \tag{6.10}$$
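The censored distribution for the case $X_i \le 0$ can be checked by simulation. The sketch below is an illustration under stated assumptions, not part of the original study: it takes X uniform on (-1, 1), so that F(t) = (t + 1)/2 and F(0) = 1/2, and the function names and the sampler interface are hypothetical.

```python
import random


def censored_sample(draw, p, n, rng):
    """Collect n censored observations: each X <= 0 is removed with probability p."""
    out = []
    while len(out) < n:
        x = draw(rng)
        if x <= 0 and rng.random() < p:
            continue                  # item removed from the production line
        out.append(x)
    return out


def censored_cdf(F, f0, p, t):
    """P(C <= t) when nonpositive observations are censored with probability p."""
    if t <= 0:
        return (1 - p) * F(t) / (1 - p * f0)
    return (F(t) - p * f0) / (1 - p * f0)


def empirical_cdf(sample, t):
    """Fraction of the sample that is <= t."""
    return sum(1 for c in sample if c <= t) / len(sample)
```

A typical comparison draws a large sample with `censored_sample(lambda r: r.uniform(-1, 1), 0.5, 100000, random.Random(0))` and compares `empirical_cdf` with `censored_cdf` at a few points on either side of zero.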



Suppose now that the symmetry condition $F(-t) = F(0)[1 - F(t) + F(-t)]$ holds for all t > 0. In the case of random censoring for $X_i \le 0$ we have for t > 0

$$\begin{aligned}
G(t) &= P(C_i \le t) \\
F(t) &= [1 - pF(0)]\, G(t) + pF(0) \\
F(-t) &= \frac{1 - pF(0)}{1-p}\, G(-t) \\
F(0) &= \frac{1 - pF(0)}{1-p}\, G(0).
\end{aligned}$$

Using these relations it follows that

$$G(0)[1 - G(t) + G(-t)] = F(0)\,\frac{1-p}{1 - pF(0)} \cdot \frac{1 - F(t) + (1-p)\,F(-t)}{1 - pF(0)}$$

and from the symmetry condition on F we get

$$G(0)[1 - G(t) + G(-t)] = G(-t) \quad \text{for all } t > 0,$$

with a change from F(0) to $G(0) = \dfrac{(1-p)\,F(0)}{1 - pF(0)}$. In particular for F(0) = 1/2, G(0) = (1-p)/(2-p). A similar calculation for censoring when $X_i > 0$ shows that the symmetry condition holds for G(t) and $G(0) = F(0)/(1 - p + pF(0))$. For F(0) = 1/2, G(0) = 1/(2-p).

We shall now compare the expected number of observations needed to 
stop a process subject to random sampling using a Shewhart type control 
chart with the expected number needed using the procedure described 
above. Consider a sequence of independent observations $X_1, X_2, \ldots$ with common continuous symmetric distribution F(x). Subjecting the $X_i$ to random censoring when $X_i \le 0$ we get from (6.9) the distribution of the censored observations $C_1, C_2, \ldots$ as

$$P(C_i \le t) = \begin{cases} \dfrac{1-p}{2-p}\, 2F(t), & t \le 0 \\[6pt] \dfrac{2F(t) - p}{2-p}, & t > 0. \end{cases} \tag{6.11}$$

We assume here that when p = 0 the process is in control and that when the process starts some fixed value of p, $0 \le p \le 1$, is in effect. If p > 0 we want to stop the process as quickly as possible. We consider three procedures:

procedure 1 - when $C_i > b > 0$ for the first time, stop the process

procedure 2 - when $|C_i| > b > 0$ for the first time, stop the process

procedure 3 - when $|S_n| > a > 0$ for the first time, stop the process

Procedures 1 and 2 are Shewhart type procedures and b is usually taken so that the probability of stopping at a particular stage is small when p = 0. Procedure 3 is the signed sequential rank procedure previously described in this section. Define $p_1 = P(C_i > b)$ and $p_2 = P(|C_i| > b)$ assuming p = 0. For each procedure the probability of falling outside the control limit for the first time at the $n$th observation is

$$p_i (1 - p_i)^{n-1}, \qquad i = 1, 2$$

and $E_1(N) = 1/p_1$, $E_2(N) = 1/p_2$ are the expected number of observations taken before stopping. $E_3(N) = 3a^2$ and setting $E_1(N) = E_2(N) = E_3(N)$ we get

$$p_1 = 1 - F(b) = F(-b) = \frac{1}{3a^2}$$

$$p_2 = 1 - F(b) + F(-b) = 2F(-b) = \frac{1}{3a^2}.$$

For p > 0

$$p_1' = P(C_i > b) = 1 - P(C_i \le b) = \frac{2F(-b)}{2-p} = \frac{2}{(2-p)\,3a^2}$$

and

$$p_2' = P(C_i > b) + P(C_i < -b) = \frac{2F(-b)}{2-p} + \frac{1-p}{2-p}\, 2F(-b) = 2F(-b) = \frac{1}{3a^2}.$$

Thus for p > 0

$$E_1(N) = \frac{1}{p_1'} = \frac{(2-p)\,3a^2}{2}$$

$$E_2(N) = \frac{1}{p_2'} = 3a^2$$

$$E_3(N) = \frac{-2a + 4a\,\dfrac{e^{ah} - 1}{e^{2ah} - 1}}{p/(2-p)}$$



and notice that since $P(C_i \le 0) = \dfrac{1-p}{2-p} < 1/2$, it follows that h < 0. $E_1(N)$ and $E_2(N)$ increase quadratically with a and $E_3(N)$ is essentially linear in a. For example

| a | p | $E_1(N)$ | $E_3(N)$ |
|----|-----|-------|------|
| 10 | 1 | 150 | 20 |
| 10 | 3/4 | 187.5 | 33.3 |
| 10 | 1/2 | 225 | 60 |
| 10 | 1/4 | 262.5 | 140 |
| 20 | 1 | 600 | 40 |
| 20 | 3/4 | 750 | 66.6 |
| 20 | 1/2 | 900 | 120 |
| 20 | 1/4 | 1050 | 280 |

and procedure 2 is insensitive to values of p > 0. The values of h corresponding to p = 1, 3/4, 1/2, 1/4 are $-\infty$, -2.2, -.9, -.5 respectively.
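The tabulated figures can be re-derived with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and for procedure 3 it uses the simplification $P(S_N \ge a) \approx 1$ (appropriate when $|ah|$ is large) in the expression for $E_3(N)$, which gives $2a(2-p)/p$. That simplification is an assumption made here because it matches the tabulated $E_3(N)$ values; the source does not state how they were computed.

```python
from fractions import Fraction


def expected_run_lengths(a, p):
    """Expected stopping times (E1, E2, E3) for censoring probability p > 0.

    E1 = (2 - p) * 3a^2 / 2 and E2 = 3a^2 come from the geometric stopping law.
    E3 = 2a(2 - p)/p is the large-|ah| simplification P(S_N >= a) ~ 1
    (an assumption of this sketch, not a formula quoted from the text).
    """
    a, p = Fraction(a), Fraction(p)
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    e1 = (2 - p) * 3 * a**2 / 2
    e2 = 3 * a**2
    e3 = 2 * a * (2 - p) / p
    return e1, e2, e3
```

When $|ah|$ is only moderate, the factor $(e^{ah} - 1)/(e^{2ah} - 1)$ is noticeably below 1 and the unsimplified expression for $E_3(N)$ gives a somewhat smaller value than this shortcut.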

The following tabulated results were obtained empirically to determine the effect of translation of the mean of the observations. We considered normal observations with mean $\mu$ and variance 1 and stopped sampling when $|S_n| > a$ for the first time where

$$S_n = \sum_{i=1}^{n} Z_i = \sum_{i=1}^{n} \frac{Y_i}{i}$$

and $Y_i$ is the signed sequential rank of $X_i$, $X_i \sim N(\mu, 1)$. For each parameter pair $(a, \mu)$ twenty trials were performed except for $\mu$ = .1, .2, .3 where fifty trials were used. Sample averages, sample variances and sample standard deviations for termination time N are given.

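| $\mu$ | $\bar{N}$ (a = 10) | $s^2$ (a = 10) | $s$ (a = 10) | $\bar{N}$ (a = 20) | $s^2$ (a = 20) | $s$ (a = 20) |
|-----|--------|----------|--------|--------|----------|--------|
| .1 | 180.78 | 13449.27 | 115.97 | 364.56 | 31279.43 | 176.85 |
| .2 | 101.78 | 3306.46 | 57.50 | 179.04 | 3345.18 | 57.83 |
| .3 | 69.95 | 710.12 | 26.64 | 130.40 | 1437.18 | 37.91 |
| .4 | 52.55 | 324.99 | 18.02 | 108.65 | 1160.87 | 34.07 |
| .5 | 42.25 | 139.14 | 11.79 | 77.25 | 367.14 | 19.16 |
| .6 | 39.60 | 121.41 | 11.01 | 72.70 | 171.69 | 13.10 |
| .7 | 36.55 | 128.05 | 11.31 | 67.45 | 130.26 | 11.41 |
| .8 | 28.70 | 31.48 | 5.61 | 62.05 | 115.31 | 10.73 |
| .9 | 28.00 | 38.31 | 6.18 | 53.55 | 67.31 | 8.20 |
| 1.0 | 28.80 | 29.64 | 5.44 | 52.25 | 39.77 | 6.30 |
| 1.5 | 23.40 | 7.93 | 2.81 | 44.00 | 26.94 | 5.19 |
| 2.0 | 22.40 | 4.98 | 2.23 | 42.45 | 12.05 | 3.47 |
| 2.5 | 20.90 | 5.25 | 2.29 | 41.55 | 9.20 | 3.03 |
| 3.0 | 21.65 | 7.60 | 2.75 | 40.95 | 13.83 | 3.72 |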
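An experiment of this kind can be sketched in a few lines. The code below is a hedged illustration, not the program used for the table: it assumes the signed sequential rank $Y_i$ is the rank of $|X_i|$ among $|X_1|, \ldots, |X_i|$ carrying the sign of $X_i$, that $S_n$ is the sum of $Y_i / i$, and that a run is cut off at a hypothetical cap `n_cap`. Simulated averages under these assumptions need not agree with the tabulated values.

```python
import bisect
import random


def stopping_time(mu, a, rng, n_cap=100000):
    """Number of N(mu, 1) observations until |S_n| > a, with S_n = sum of Y_i / i."""
    abs_sorted = []                   # |X_1|, ..., |X_{n-1}| in increasing order
    s = 0.0
    for n in range(1, n_cap + 1):
        x = rng.gauss(mu, 1.0)
        # Rank of |X_n| among the first n absolute values (1 = smallest).
        rank = bisect.bisect_left(abs_sorted, abs(x)) + 1
        bisect.insort(abs_sorted, abs(x))
        s += (rank if x > 0 else -rank) / n
        if abs(s) > a:
            return n
    return n_cap                      # cap reached without leaving (-a, a)


def mean_stopping_time(mu, a, trials, seed=0):
    """Sample average of the stopping time over independent trials."""
    rng = random.Random(seed)
    return sum(stopping_time(mu, a, rng) for _ in range(trials)) / trials
```

Because each term $Y_i / i$ is at most 1 in absolute value, no run can stop before more than a observations have been taken, which gives a quick sanity check on any implementation.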
7. Summary and Conclusions. We remarked in the introduction on
the paucity of nonparametric sequential procedures, particularly those
based on ranks of observations. The author feels that the absence of
a natural way of assigning ranks to observations, as the observations 
are taken, without reranking, was a significant cause for the lack of 
such procedures. The sequential ranking schemes defined and studied in 
this dissertation provide us with methods whereby ranks may be assigned 
in just such a manner. 

In order to use the methods of sequential parametric hypothesis
testing (Wald's sequential probability ratio test) in our nonparametric
setting, we must replace the sequence of observations $X_1, X_2, \ldots$
by a sequence of ranks $R_1, R_2, \ldots$ and base the test on the probability
ratio of the ranks. This can be done by the sequential ranking scheme
defined in Section 3. One basic nonparametric problem is the two-sample
problem where we must decide whether or not an X-population
and a Y-population have the same probability distribution. This
problem was treated in Section 4 in the special case where the alternatives
are of the form proposed by Lehmann [1]. However the method
proposed in Section 4 is general in the sense that in order to carry
out the test one must only be able to compute $P(U_1 < U_2 < \cdots < U_n)$
where the U's are X's and Y's. In general this computation is
difficult, but for special alternatives where the computation is feasible,
the method in Section 4 applies directly.

Notice that in the finite sample size problem nothing is sacrificed
by ranking sequentially (Theorem 3.1) instead of using ordinary ranks.
In fact a little is gained inasmuch as the sequential ranks may be
viewed as a transformation of the dependent ordinary ranks into the
independent sequential ranks.

Merely ranking observations tells us nothing of their location, 
except relative to each other. In order to take into account the 
location of each observation relative to the origin as well as its 
size (absolute value) and relative location, the method of signed 
sequential ranking was devised. Contrary to sequential ranks, signed 
sequential ranks obtained from independent identically distributed 
observations are not independent in general. A sufficient condition
on the distribution of the observations is given in Theorem 5.1 to
insure that the signed sequential ranks will be independent. In the
process control problem we used signed sequential ranks of observations 
whose distributions satisfied this condition. This simplified the 
calculations since sums of independent random variables were involved 
in the analysis. 

The methods of sequential ranking and signed sequential ranking
proposed in this dissertation are new, as far as the author can determine,
and provide a natural way of assigning ranks to observations
which fits into the theory of sequential analysis (hypothesis testing)
and sequential procedures (process control). All the attendant distribution
theory results are new and the condition of Theorem 5.1 which
insures the independence of signed sequential ranks is the only one
known to the author.

There are many areas for further investigation suggested by this
research. In the sequential probability ratio test of Section 4 we
did not use the sequential ranks explicitly (except for $Z_n$ in equation
(4.2)) in the definition of the probability ratio. That ratio can be
written in terms of $(Z_1, Z_2, \ldots, Z_n)$, the sequential ranks, but the
expression is quite complicated and it is much more convenient to use
(4.1) and (4.2) which incorporate the most recent sequential rank only.
Thus the behavior of was obtained by reference to ... A^. 

More general results are needed as to the probability of termination of
$P_1(Z_n)/P_0(Z_n)$ for alternatives other than Lehmann alternatives. This
is necessary because under the alternative hypothesis the sequential
ranks are not independent generally and the conservative approximations
$A = (1-\beta)/\alpha$ remain valid for successive dependent observations when the
probability is one that the procedure will ultimately terminate.

A second area for further study is the evaluation of the rule
given in Section 6 for process control problems when changes from F
to G are not of the form presented (e.g. $G(x) = F(x + \Delta)$, $\Delta > 0$).
Also there are other ad hoc rules which could be proposed using signed
sequential ranks (or sequential ranks) in process control problems.